Posted to common-user@hadoop.apache.org by Pierre ANCELOT <pi...@gmail.com> on 2010/05/18 14:06:24 UTC

Any possible to set hdfs block size to a value smaller than 64MB?

Hi,
I'm porting a legacy application to Hadoop, and it uses a bunch of small
files.
I'm aware that so many small files aren't a good idea, but I'm not the one
making the technical decisions, and the port had to be done yesterday...
Of course such small files are a problem: loading a 64MB block for a few
lines of text is an obvious waste.
What will happen if I set a smaller, or even much smaller (32kB), block size?

Thank you.

Pierre ANCELOT.
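
For reference, the HDFS block size can be set cluster-wide (dfs.block.size in
hdfs-site.xml) or requested per file at create time. Here is a minimal sketch,
assuming the 0.20-era FileSystem API; the 32kB value and the path are purely
illustrative, and newer releases may reject block sizes below a configured
minimum:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SmallBlockWrite {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // create(path, overwrite, bufferSize, replication, blockSize)
        FSDataOutputStream out = fs.create(new Path("/tmp/tiny.txt"),
            true, 4096, (short) 3, 32 * 1024L);
        out.writeBytes("a few lines of text\n");
        out.close();
        fs.close();
      }
    }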

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
... and by that I mean split into slices of 64MB?

On Tue, May 18, 2010 at 2:38 PM, Pierre ANCELOT <pi...@gmail.com> wrote:

> Hi, thanks for this fast answer :)
> If so, what do you mean by blocks? If a file has to be splitted, it will be
> splitted when larger than 64MB?
>
>
>
>
>
> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <bb...@cse.unl.edu>wrote:
>
>> Hey Pierre,
>>
>> These are not traditional filesystem blocks - if you save a file smaller
>> than 64MB, you don't lose 64MB of file space..
>>
>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
>> so), not 64MB.
>>
>> Brian
>>
>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>
>> > Hi,
>> > I'm porting a legacy application to hadoop and it uses a bunch of small
>> > files.
>> > I'm aware that having such small files ain't a good idea but I'm not
>> doing
>> > the technical decisions and the port has to be done for yesterday...
>> > Of course such small files are a problem, loading 64MB blocks for a few
>> > lines of text is an evident loss.
>> > What will happen if I set a smaller, or even way smaller (32kB) blocks?
>> >
>> > Thank you.
>> >
>> > Pierre ANCELOT.
>>
>>
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>
>


-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Konstantin Boudnik <co...@yahoo-inc.com>.
I ran an experiment with a block size of 10 bytes (sic!). It was _very_ slow
on the NN side: writing 5 MB took 25 minutes or so :( No fun, to
say the least...

On Tue, May 18, 2010 at 10:56AM, Konstantin Shvachko wrote:
> You can also get some performance numbers and answers to the block size dilemma problem here:
> 
> http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
> 
> I remember some people were using Hadoop for storing or streaming videos.
> Don't know how well that worked.
> It would be interesting to learn about your experience.
> 
> Thanks,
> --Konstantin
> 
> 
> On 5/18/2010 8:41 AM, Brian Bockelman wrote:
> > Hey Hassan,
> >
> > 1) The overhead is pretty small, measured in a small number of milliseconds on average
> > 2) HDFS is not designed for "online latency".  Even though the average is small, if something "bad happens", your clients might experience a lot of delays while going through the retry stack.  The initial design was for batch processing, and latency-sensitive applications came later.
> >
> > Additionally since the NN is a SPOF, you might want to consider your uptime requirements.  Each organization will have to balance these risks with the advantages (such as much cheaper hardware).
> >
> > There's a nice interview with the GFS authors here where they touch upon the latency issues:
> >
> > http://queue.acm.org/detail.cfm?id=1594206
> >
> > As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you.
> >
> > As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
> >
> > Brian
> >
> > On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
> >
> >> This is a very interesting thread to us, as we are thinking about deploying
> >> HDFS as a massive online storage for a on online university, and then
> >> serving the video files to students who want to view them.
> >>
> >> We cannot control the size of the videos (and some class work files), as
> >> they will mostly be uploaded by the teachers providing the classes.
> >>
> >> How would the overall through put of HDFS be affected in such a solution?
> >> Would HDFS be feasible at all for such a setup?
> >>
> >> Regards
> >> HASSAN
> >>
> >>
> >>
> >> On Tue, May 18, 2010 at 21:11, He Chen<ai...@gmail.com>  wrote:
> >>
> >>> If you know how to use AspectJ to do aspect oriented programming. You can
> >>> write a aspect class. Let it just monitors the whole process of MapReduce
> >>>
> >>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles<patrick@cloudera.com
> >>>> wrote:
> >>>
> >>>> Should be evident in the total job running time... that's the only metric
> >>>> that really matters :)
> >>>>
> >>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT<pierreact@gmail.com
> >>>>> wrote:
> >>>>
> >>>>> Thank you,
> >>>>> Any way I can measure the startup overhead in terms of time?
> >>>>>
> >>>>>
> >>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles<patrick@cloudera.com
> >>>>>> wrote:
> >>>>>
> >>>>>> Pierre,
> >>>>>>
> >>>>>> Adding to what Brian has said (some things are not explicitly
> >>> mentioned
> >>>>> in
> >>>>>> the HDFS design doc)...
> >>>>>>
> >>>>>> - If you have small files that take up<  64MB you do not actually use
> >>>> the
> >>>>>> entire 64MB block on disk.
> >>>>>> - You *do* use up RAM on the NameNode, as each block represents
> >>>> meta-data
> >>>>>> that needs to be maintained in-memory in the NameNode.
> >>>>>> - Hadoop won't perform optimally with very small block sizes. Hadoop
> >>>> I/O
> >>>>> is
> >>>>>> optimized for high sustained throughput per single file/block. There
> >>> is
> >>>> a
> >>>>>> penalty for doing too many seeks to get to the beginning of each
> >>> block.
> >>>>>> Additionally, you will have a MapReduce task per small file. Each
> >>>>> MapReduce
> >>>>>> task has a non-trivial startup overhead.
> >>>>>> - The recommendation is to consolidate your small files into large
> >>>> files.
> >>>>>> One way to do this is via SequenceFiles... put the filename in the
> >>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
> >>> value
> >>>>>> field.
> >>>>>>
> >>>>>> In addition to the HDFS design docs, I recommend reading this blog
> >>>> post:
> >>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> >>>>>>
> >>>>>> Happy Hadooping,
> >>>>>>
> >>>>>> - Patrick
> >>>>>>
> >>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT<pierreact@gmail.com
> >>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Okay, thank you :)
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman<
> >>>> bbockelm@cse.unl.edu
> >>>>>>>> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> >>>>>>>>
> >>>>>>>>> Hi, thanks for this fast answer :)
> >>>>>>>>> If so, what do you mean by blocks? If a file has to be
> >>> splitted,
> >>>> it
> >>>>>>> will
> >>>>>>>> be
> >>>>>>>>> splitted when larger than 64MB?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
> >>>> So,
> >>>>>> if
> >>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
> >>> file
> >>>>> is
> >>>>>>> 65MB,
> >>>>>>>> then it will have one block of 64MB and another block of 1MB.
> >>>>>>>>
> >>>>>>>> Splitting files is very useful for load-balancing and
> >>> distributing
> >>>>> I/O
> >>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
> >>>>> split
> >>>>>>> the
> >>>>>>>> files at all.
> >>>>>>>>
> >>>>>>>> I recommend reading the HDFS design document for background
> >>> issues
> >>>>> like
> >>>>>>>> this:
> >>>>>>>>
> >>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> >>>>>>>>
> >>>>>>>> Brian
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman<
> >>>>>> bbockelm@cse.unl.edu
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hey Pierre,
> >>>>>>>>>>
> >>>>>>>>>> These are not traditional filesystem blocks - if you save a
> >>> file
> >>>>>>> smaller
> >>>>>>>>>> than 64MB, you don't lose 64MB of file space..
> >>>>>>>>>>
> >>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> >>>>> metadata
> >>>>>>> or
> >>>>>>>>>> so), not 64MB.
> >>>>>>>>>>
> >>>>>>>>>> Brian
> >>>>>>>>>>
> >>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
> >>> bunch
> >>>> of
> >>>>>>> small
> >>>>>>>>>>> files.
> >>>>>>>>>>> I'm aware that having such small files ain't a good idea but
> >>>> I'm
> >>>>>> not
> >>>>>>>>>> doing
> >>>>>>>>>>> the technical decisions and the port has to be done for
> >>>>>> yesterday...
> >>>>>>>>>>> Of course such small files are a problem, loading 64MB blocks
> >>>> for
> >>>>> a
> >>>>>>> few
> >>>>>>>>>>> lines of text is an evident loss.
> >>>>>>>>>>> What will happen if I set a smaller, or even way smaller
> >>> (32kB)
> >>>>>>> blocks?
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you.
> >>>>>>>>>>>
> >>>>>>>>>>> Pierre ANCELOT.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> http://www.neko-consulting.com
> >>>>>>>>> Ego sum quis ego servo
> >>>>>>>>> "Je suis ce que je protège"
> >>>>>>>>> "I am what I protect"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> http://www.neko-consulting.com
> >>>>>>> Ego sum quis ego servo
> >>>>>>> "Je suis ce que je protège"
> >>>>>>> "I am what I protect"
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> http://www.neko-consulting.com
> >>>>> Ego sum quis ego servo
> >>>>> "Je suis ce que je protège"
> >>>>> "I am what I protect"
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Wishes!
> >>> 顺送商祺!
> >>>
> >>> --
> >>> Chen He
> >>> (402)613-9298
> >>> PhD. student of CSE Dept.
> >>> Holland Computing Center
> >>> University of Nebraska-Lincoln
> >>> Lincoln NE 68588
> >>>
> >
> 
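
A minimal sketch of the consolidation Patrick recommends in the quoted thread
above: the original file name goes in the SequenceFile key and the file's raw
bytes go in the value. It assumes the 0.20-era SequenceFile.createWriter
signature, and the local and HDFS paths are illustrative:

    import java.io.File;
    import java.io.FileInputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackSmallFiles {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, new Path("/data/packed.seq"), Text.class, BytesWritable.class);
        try {
          // Assumes the local directory exists and holds only small files.
          for (File f : new File("/data/small-files").listFiles()) {
            byte[] bytes = new byte[(int) f.length()];
            FileInputStream in = new FileInputStream(f);
            try {
              IOUtils.readFully(in, bytes, 0, bytes.length);
            } finally {
              in.close();
            }
            // key = original file name, value = the file's bytes
            writer.append(new Text(f.getName()), new BytesWritable(bytes));
          }
        } finally {
          writer.close();
        }
      }
    }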

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Nyamul Hassan <mn...@usa.net>.
Thank you Brian and Konstantin for the two very interesting reads.  I had to
read some lines quite a few times to properly understand what they were
trying to explain.  That interview was a class of its own.

It does seem that Google has used BigTable heavily to overcome some of the
shortcomings of GFS.  Is that the same thing Konstantin is referring to with
HBase over HDFS?

Regards
HASSAN



On Thu, May 20, 2010 at 04:11, Konstantin Shvachko <sh...@yahoo-inc.com>wrote:

> Hi Brian,
>
> Interesting observations.
> This is probably in line with the "client side mount table" approach, [soon
> to be] proposed in HDFS-1053.
> Another way to provide personalized view of the file system would be to use
> symbolic links, which is
> available now in 0.21/22.
> For 4K files I would probably use h-archives, especially if the data, as
> you describe it, is not changing.
> But high-energy physicist should be considering using HBase over HDFS.
>
> --konst
>
>
>
> On 5/18/2010 11:57 AM, Brian Bockelman wrote:
>
>>
>> Hey Konstantin,
>>
>> Interesting paper :)
>>
>> One thing which I've been kicking around lately is "at what scale does the
>> file/directory paradigm break down?"
>>
>> At some point, I think the human mind can no longer comprehend so many
>> files (certainly, I can barely organize the few thousand files on my
>> laptop).  Thus, file growth comes from (a) having lots of humans use a
>> single file system or (b) automated programs generate the files.  For (a),
>> you don't need a central global namespace, you just need the ability to have
>> a "local" namespace per-person that can be shared among friends.  For (b), a
>> program isn't going to be upset if you replace a file system with a database
>> / dataset object / bucket.
>>
>> Two examples:
>> - structural biology: I've seen a lot of different analysis workflows
>> (such as autodock) that compares a protein against a "database" of ligands,
>> where the database is 80,000 O(4KB) files.  Each file represents a known
>> ligand that the biologist might come back and examine if it is relevant to
>> the study of their protein.
>> - high-energy physics: Each detector can produce millions of a events a
>> night, and experiments will produce many billions of events.  These are
>> saved into files (each file containing hundreds or thousands of events);
>> these files are kept in collections (called datasets, data blocks, lumi
>> sections, or runs, depending on what you're doing).  Tasks are run against
>> datasets; they will output smaller datasets which the physicists will
>> iterate upon until they get some dataset which fits onto their laptop.
>> Here's my Claim: The biologists have a small enough number of objects to
>> manage each one as a separate file; they do this because it's easier for
>> humans navigating around in a terminal.  The physicists have such a huge
>> number of objects that there's no way to manage them using one file per
>> object, so they utilize files only as a mechanism to serialize bytes of data
>> and have higher-order data structures for management
>> Here's my Question: at what point do you move from the biologist's model
>> (named objects, managed independently, single files) to the physicist's
>> model (anonymous objects, managed in large groups, files are only used
>> because we save data on file systems)?
>>
>> Another way to look at this is to consider DNS.  DNS maintains the
>> namespace of the globe, but appears to do this just fine without a single
>> central catalog.  If you start with a POSIX filesystem namespace (and the
>> guarantees it implies), what rules must you relax in order to arrive at DNS?
>>  On the scale of managing million (billion? ten billion? trillion?) files,
>> are any of the assumptions relevant?
>>
>> I don't know the answers to these questions, but I suspect they become
>> important over the next 10 years.
>>
>> Brian
>>
>> PS - I starting thinking along these lines during MSST when the LLNL guy
>> was speculating about what it meant to "fsck" a file system with 1 trillion
>> files.
>>
>> On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:
>>
>>  You can also get some performance numbers and answers to the block size
>>> dilemma problem here:
>>>
>>>
>>> http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
>>>
>>> I remember some people were using Hadoop for storing or streaming videos.
>>> Don't know how well that worked.
>>> It would be interesting to learn about your experience.
>>>
>>> Thanks,
>>> --Konstantin
>>>
>>>
>>> On 5/18/2010 8:41 AM, Brian Bockelman wrote:
>>>
>>>> Hey Hassan,
>>>>
>>>> 1) The overhead is pretty small, measured in a small number of
>>>> milliseconds on average
>>>> 2) HDFS is not designed for "online latency".  Even though the average
>>>> is small, if something "bad happens", your clients might experience a lot of
>>>> delays while going through the retry stack.  The initial design was for
>>>> batch processing, and latency-sensitive applications came later.
>>>>
>>>> Additionally since the NN is a SPOF, you might want to consider your
>>>> uptime requirements.  Each organization will have to balance these risks
>>>> with the advantages (such as much cheaper hardware).
>>>>
>>>> There's a nice interview with the GFS authors here where they touch upon
>>>> the latency issues:
>>>>
>>>> http://queue.acm.org/detail.cfm?id=1594206
>>>>
>>>> As GFS and HDFS share many design features, the theoretical parts of
>>>> their discussion might be useful for you.
>>>>
>>>> As far as overall throughput of the system goes, it depends heavily upon
>>>> your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
>>>>
>>>> Brian
>>>>
>>>> On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
>>>>
>>>>  This is a very interesting thread to us, as we are thinking about
>>>>> deploying
>>>>> HDFS as a massive online storage for a on online university, and then
>>>>> serving the video files to students who want to view them.
>>>>>
>>>>> We cannot control the size of the videos (and some class work files),
>>>>> as
>>>>> they will mostly be uploaded by the teachers providing the classes.
>>>>>
>>>>> How would the overall through put of HDFS be affected in such a
>>>>> solution?
>>>>> Would HDFS be feasible at all for such a setup?
>>>>>
>>>>> Regards
>>>>> HASSAN
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 18, 2010 at 21:11, He Chen<ai...@gmail.com>   wrote:
>>>>>
>>>>>  If you know how to use AspectJ to do aspect oriented programming. You
>>>>>> can
>>>>>> write a aspect class. Let it just monitors the whole process of
>>>>>> MapReduce
>>>>>>
>>>>>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles<
>>>>>> patrick@cloudera.com
>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>
>>>>>>  Should be evident in the total job running time... that's the only
>>>>>>> metric
>>>>>>> that really matters :)
>>>>>>>
>>>>>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>
>>>>>>>  Thank you,
>>>>>>>> Any way I can measure the startup overhead in terms of time?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles<
>>>>>>>> patrick@cloudera.com
>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>
>>>>>>>>  Pierre,
>>>>>>>>>
>>>>>>>>> Adding to what Brian has said (some things are not explicitly
>>>>>>>>>
>>>>>>>> mentioned
>>>>>>
>>>>>>> in
>>>>>>>>
>>>>>>>>> the HDFS design doc)...
>>>>>>>>>
>>>>>>>>> - If you have small files that take up<   64MB you do not actually
>>>>>>>>> use
>>>>>>>>>
>>>>>>>> the
>>>>>>>
>>>>>>>> entire 64MB block on disk.
>>>>>>>>> - You *do* use up RAM on the NameNode, as each block represents
>>>>>>>>>
>>>>>>>> meta-data
>>>>>>>
>>>>>>>> that needs to be maintained in-memory in the NameNode.
>>>>>>>>> - Hadoop won't perform optimally with very small block sizes.
>>>>>>>>> Hadoop
>>>>>>>>>
>>>>>>>> I/O
>>>>>>>
>>>>>>>> is
>>>>>>>>
>>>>>>>>> optimized for high sustained throughput per single file/block.
>>>>>>>>> There
>>>>>>>>>
>>>>>>>> is
>>>>>>
>>>>>>> a
>>>>>>>
>>>>>>>> penalty for doing too many seeks to get to the beginning of each
>>>>>>>>>
>>>>>>>> block.
>>>>>>
>>>>>>> Additionally, you will have a MapReduce task per small file. Each
>>>>>>>>>
>>>>>>>> MapReduce
>>>>>>>>
>>>>>>>>> task has a non-trivial startup overhead.
>>>>>>>>> - The recommendation is to consolidate your small files into large
>>>>>>>>>
>>>>>>>> files.
>>>>>>>
>>>>>>>> One way to do this is via SequenceFiles... put the filename in the
>>>>>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
>>>>>>>>>
>>>>>>>> value
>>>>>>
>>>>>>> field.
>>>>>>>>>
>>>>>>>>> In addition to the HDFS design docs, I recommend reading this blog
>>>>>>>>>
>>>>>>>> post:
>>>>>>>
>>>>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>>>>>
>>>>>>>>> Happy Hadooping,
>>>>>>>>>
>>>>>>>>> - Patrick
>>>>>>>>>
>>>>>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT<
>>>>>>>>> pierreact@gmail.com
>>>>>>>>>
>>>>>>>>
>>>>>>>  wrote:
>>>>>>>>>
>>>>>>>>>  Okay, thank you :)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman<
>>>>>>>>>>
>>>>>>>>> bbockelm@cse.unl.edu
>>>>>>>
>>>>>>>>  wrote:
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>>>>>>>>>>>
>>>>>>>>>>>  Hi, thanks for this fast answer :)
>>>>>>>>>>>> If so, what do you mean by blocks? If a file has to be
>>>>>>>>>>>>
>>>>>>>>>>> splitted,
>>>>>>
>>>>>>> it
>>>>>>>
>>>>>>>> will
>>>>>>>>>>
>>>>>>>>>>> be
>>>>>>>>>>>
>>>>>>>>>>>> splitted when larger than 64MB?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
>>>>>>>>>>>
>>>>>>>>>> So,
>>>>>>>
>>>>>>>> if
>>>>>>>>>
>>>>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
>>>>>>>>>>>
>>>>>>>>>> file
>>>>>>
>>>>>>> is
>>>>>>>>
>>>>>>>>> 65MB,
>>>>>>>>>>
>>>>>>>>>>> then it will have one block of 64MB and another block of 1MB.
>>>>>>>>>>>
>>>>>>>>>>> Splitting files is very useful for load-balancing and
>>>>>>>>>>>
>>>>>>>>>> distributing
>>>>>>
>>>>>>> I/O
>>>>>>>>
>>>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
>>>>>>>>>>>
>>>>>>>>>> split
>>>>>>>>
>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>>> files at all.
>>>>>>>>>>>
>>>>>>>>>>> I recommend reading the HDFS design document for background
>>>>>>>>>>>
>>>>>>>>>> issues
>>>>>>
>>>>>>> like
>>>>>>>>
>>>>>>>>> this:
>>>>>>>>>>>
>>>>>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>>>>>>>>>>>
>>>>>>>>>>> Brian
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman<
>>>>>>>>>>>>
>>>>>>>>>>> bbockelm@cse.unl.edu
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>  Hey Pierre,
>>>>>>>>>>>>>
>>>>>>>>>>>>> These are not traditional filesystem blocks - if you save a
>>>>>>>>>>>>>
>>>>>>>>>>>> file
>>>>>>
>>>>>>>  smaller
>>>>>>>>>>
>>>>>>>>>>> than 64MB, you don't lose 64MB of file space..
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
>>>>>>>>>>>>>
>>>>>>>>>>>> metadata
>>>>>>>>
>>>>>>>>> or
>>>>>>>>>>
>>>>>>>>>>> so), not 64MB.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>
>>>>>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Hi,
>>>>>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
>>>>>>>>>>>>>>
>>>>>>>>>>>>> bunch
>>>>>>
>>>>>>> of
>>>>>>>
>>>>>>>> small
>>>>>>>>>>
>>>>>>>>>>>  files.
>>>>>>>>>>>>>> I'm aware that having such small files ain't a good idea but
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm
>>>>>>>
>>>>>>>> not
>>>>>>>>>
>>>>>>>>>>  doing
>>>>>>>>>>>>>
>>>>>>>>>>>>>> the technical decisions and the port has to be done for
>>>>>>>>>>>>>>
>>>>>>>>>>>>> yesterday...
>>>>>>>>>
>>>>>>>>>>  Of course such small files are a problem, loading 64MB blocks
>>>>>>>>>>>>>>
>>>>>>>>>>>>> for
>>>>>>>
>>>>>>>> a
>>>>>>>>
>>>>>>>>> few
>>>>>>>>>>
>>>>>>>>>>>  lines of text is an evident loss.
>>>>>>>>>>>>>> What will happen if I set a smaller, or even way smaller
>>>>>>>>>>>>>>
>>>>>>>>>>>>> (32kB)
>>>>>>
>>>>>>>  blocks?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pierre ANCELOT.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>>>>> Ego sum quis ego servo
>>>>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>>>>> "I am what I protect"
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>>> Ego sum quis ego servo
>>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>>> "I am what I protect"
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> http://www.neko-consulting.com
>>>>>>>> Ego sum quis ego servo
>>>>>>>> "Je suis ce que je protège"
>>>>>>>> "I am what I protect"
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Wishes!
>>>>>> 顺送商祺!
>>>>>>
>>>>>> --
>>>>>> Chen He
>>>>>> (402)613-9298
>>>>>> PhD. student of CSE Dept.
>>>>>> Holland Computing Center
>>>>>> University of Nebraska-Lincoln
>>>>>> Lincoln NE 68588
>>>>>>
>>>>>>
>>>>
>>
>
>

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Hi Brian,

Interesting observations.
This is probably in line with the "client side mount table" approach, [soon to be] proposed in HDFS-1053.
Another way to provide a personalized view of the file system would be to use symbolic links, which are
available now in 0.21/22.
For 4K files I would probably use Hadoop archives (har), especially if the data, as you describe it, is not changing.
But high-energy physicists should be considering using HBase over HDFS.

--konst
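
A minimal sketch of the symbolic-link approach mentioned above, assuming the
FileContext API that ships with 0.21; the paths are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileContext;
    import org.apache.hadoop.fs.Path;

    public class PersonalView {
      public static void main(String[] args) throws Exception {
        FileContext fc = FileContext.getFileContext(new Configuration());
        // Expose a shared dataset inside a user's own directory without copying it.
        fc.createSymlink(new Path("/shared/ligands"),      // link target
                         new Path("/user/brian/ligands"),  // link to create
                         true);                            // create missing parents
      }
    }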


On 5/18/2010 11:57 AM, Brian Bockelman wrote:
>
> Hey Konstantin,
>
> Interesting paper :)
>
> One thing which I've been kicking around lately is "at what scale does the file/directory paradigm break down?"
>
> At some point, I think the human mind can no longer comprehend so many files (certainly, I can barely organize the few thousand files on my laptop).  Thus, file growth comes from (a) having lots of humans use a single file system or (b) automated programs generate the files.  For (a), you don't need a central global namespace, you just need the ability to have a "local" namespace per-person that can be shared among friends.  For (b), a program isn't going to be upset if you replace a file system with a database / dataset object / bucket.
>
> Two examples:
> - structural biology: I've seen a lot of different analysis workflows (such as autodock) that compares a protein against a "database" of ligands, where the database is 80,000 O(4KB) files.  Each file represents a known ligand that the biologist might come back and examine if it is relevant to the study of their protein.
> - high-energy physics: Each detector can produce millions of a events a night, and experiments will produce many billions of events.  These are saved into files (each file containing hundreds or thousands of events); these files are kept in collections (called datasets, data blocks, lumi sections, or runs, depending on what you're doing).  Tasks are run against datasets; they will output smaller datasets which the physicists will iterate upon until they get some dataset which fits onto their laptop.
> Here's my Claim: The biologists have a small enough number of objects to manage each one as a separate file; they do this because it's easier for humans navigating around in a terminal.  The physicists have such a huge number of objects that there's no way to manage them using one file per object, so they utilize files only as a mechanism to serialize bytes of data and have higher-order data structures for management
> Here's my Question: at what point do you move from the biologist's model (named objects, managed independently, single files) to the physicist's model (anonymous objects, managed in large groups, files are only used because we save data on file systems)?
>
> Another way to look at this is to consider DNS.  DNS maintains the namespace of the globe, but appears to do this just fine without a single central catalog.  If you start with a POSIX filesystem namespace (and the guarantees it implies), what rules must you relax in order to arrive at DNS?  On the scale of managing million (billion? ten billion? trillion?) files, are any of the assumptions relevant?
>
> I don't know the answers to these questions, but I suspect they become important over the next 10 years.
>
> Brian
>
> PS - I starting thinking along these lines during MSST when the LLNL guy was speculating about what it meant to "fsck" a file system with 1 trillion files.
>
> On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:
>
>> You can also get some performance numbers and answers to the block size dilemma problem here:
>>
>> http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
>>
>> I remember some people were using Hadoop for storing or streaming videos.
>> Don't know how well that worked.
>> It would be interesting to learn about your experience.
>>
>> Thanks,
>> --Konstantin
>>
>>
>> On 5/18/2010 8:41 AM, Brian Bockelman wrote:
>>> Hey Hassan,
>>>
>>> 1) The overhead is pretty small, measured in a small number of milliseconds on average
>>> 2) HDFS is not designed for "online latency".  Even though the average is small, if something "bad happens", your clients might experience a lot of delays while going through the retry stack.  The initial design was for batch processing, and latency-sensitive applications came later.
>>>
>>> Additionally since the NN is a SPOF, you might want to consider your uptime requirements.  Each organization will have to balance these risks with the advantages (such as much cheaper hardware).
>>>
>>> There's a nice interview with the GFS authors here where they touch upon the latency issues:
>>>
>>> http://queue.acm.org/detail.cfm?id=1594206
>>>
>>> As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you.
>>>
>>> As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
>>>
>>> Brian
>>>
>>> On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
>>>
>>>> This is a very interesting thread to us, as we are thinking about deploying
>>>> HDFS as a massive online storage for a on online university, and then
>>>> serving the video files to students who want to view them.
>>>>
>>>> We cannot control the size of the videos (and some class work files), as
>>>> they will mostly be uploaded by the teachers providing the classes.
>>>>
>>>> How would the overall through put of HDFS be affected in such a solution?
>>>> Would HDFS be feasible at all for such a setup?
>>>>
>>>> Regards
>>>> HASSAN
>>>>
>>>>
>>>>
>>>> On Tue, May 18, 2010 at 21:11, He Chen<ai...@gmail.com>   wrote:
>>>>
>>>>> If you know how to use AspectJ to do aspect oriented programming. You can
>>>>> write a aspect class. Let it just monitors the whole process of MapReduce
>>>>>
>>>>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles<patrick@cloudera.com
>>>>>> wrote:
>>>>>
>>>>>> Should be evident in the total job running time... that's the only metric
>>>>>> that really matters :)
>>>>>>
>>>>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>>>> wrote:
>>>>>>
>>>>>>> Thank you,
>>>>>>> Any way I can measure the startup overhead in terms of time?
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles<patrick@cloudera.com
>>>>>>>> wrote:
>>>>>>>
>>>>>>>> Pierre,
>>>>>>>>
>>>>>>>> Adding to what Brian has said (some things are not explicitly
>>>>> mentioned
>>>>>>> in
>>>>>>>> the HDFS design doc)...
>>>>>>>>
>>>>>>>> - If you have small files that take up<   64MB you do not actually use
>>>>>> the
>>>>>>>> entire 64MB block on disk.
>>>>>>>> - You *do* use up RAM on the NameNode, as each block represents
>>>>>> meta-data
>>>>>>>> that needs to be maintained in-memory in the NameNode.
>>>>>>>> - Hadoop won't perform optimally with very small block sizes. Hadoop
>>>>>> I/O
>>>>>>> is
>>>>>>>> optimized for high sustained throughput per single file/block. There
>>>>> is
>>>>>> a
>>>>>>>> penalty for doing too many seeks to get to the beginning of each
>>>>> block.
>>>>>>>> Additionally, you will have a MapReduce task per small file. Each
>>>>>>> MapReduce
>>>>>>>> task has a non-trivial startup overhead.
>>>>>>>> - The recommendation is to consolidate your small files into large
>>>>>> files.
>>>>>>>> One way to do this is via SequenceFiles... put the filename in the
>>>>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
>>>>> value
>>>>>>>> field.
>>>>>>>>
>>>>>>>> In addition to the HDFS design docs, I recommend reading this blog
>>>>>> post:
>>>>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>>>>
>>>>>>>> Happy Hadooping,
>>>>>>>>
>>>>>>>> - Patrick
>>>>>>>>
>>>>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Okay, thank you :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman<
>>>>>> bbockelm@cse.unl.edu
>>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, thanks for this fast answer :)
>>>>>>>>>>> If so, what do you mean by blocks? If a file has to be
>>>>> splitted,
>>>>>> it
>>>>>>>>> will
>>>>>>>>>> be
>>>>>>>>>>> splitted when larger than 64MB?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
>>>>>> So,
>>>>>>>> if
>>>>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
>>>>> file
>>>>>>> is
>>>>>>>>> 65MB,
>>>>>>>>>> then it will have one block of 64MB and another block of 1MB.
>>>>>>>>>>
>>>>>>>>>> Splitting files is very useful for load-balancing and
>>>>> distributing
>>>>>>> I/O
>>>>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
>>>>>>> split
>>>>>>>>> the
>>>>>>>>>> files at all.
>>>>>>>>>>
>>>>>>>>>> I recommend reading the HDFS design document for background
>>>>> issues
>>>>>>> like
>>>>>>>>>> this:
>>>>>>>>>>
>>>>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>>>>>>>>>>
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman<
>>>>>>>> bbockelm@cse.unl.edu
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey Pierre,
>>>>>>>>>>>>
>>>>>>>>>>>> These are not traditional filesystem blocks - if you save a
>>>>> file
>>>>>>>>> smaller
>>>>>>>>>>>> than 64MB, you don't lose 64MB of file space..
>>>>>>>>>>>>
>>>>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
>>>>>>> metadata
>>>>>>>>> or
>>>>>>>>>>>> so), not 64MB.
>>>>>>>>>>>>
>>>>>>>>>>>> Brian
>>>>>>>>>>>>
>>>>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
>>>>> bunch
>>>>>> of
>>>>>>>>> small
>>>>>>>>>>>>> files.
>>>>>>>>>>>>> I'm aware that having such small files ain't a good idea but
>>>>>> I'm
>>>>>>>> not
>>>>>>>>>>>> doing
>>>>>>>>>>>>> the technical decisions and the port has to be done for
>>>>>>>> yesterday...
>>>>>>>>>>>>> Of course such small files are a problem, loading 64MB blocks
>>>>>> for
>>>>>>> a
>>>>>>>>> few
>>>>>>>>>>>>> lines of text is an evident loss.
>>>>>>>>>>>>> What will happen if I set a smaller, or even way smaller
>>>>> (32kB)
>>>>>>>>> blocks?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pierre ANCELOT.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>>>> Ego sum quis ego servo
>>>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>>>> "I am what I protect"
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>> Ego sum quis ego servo
>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>> "I am what I protect"
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> http://www.neko-consulting.com
>>>>>>> Ego sum quis ego servo
>>>>>>> "Je suis ce que je protège"
>>>>>>> "I am what I protect"
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Wishes!
>>>>> 顺送商祺!
>>>>>
>>>>> --
>>>>> Chen He
>>>>> (402)613-9298
>>>>> PhD. student of CSE Dept.
>>>>> Holland Computing Center
>>>>> University of Nebraska-Lincoln
>>>>> Lincoln NE 68588
>>>>>
>>>
>


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Konstantin,

Interesting paper :)

One thing which I've been kicking around lately is "at what scale does the file/directory paradigm break down?"

At some point, I think the human mind can no longer comprehend so many files (certainly, I can barely organize the few thousand files on my laptop).  Thus, file growth comes from (a) lots of humans using a single file system or (b) automated programs generating the files.  For (a), you don't need a central global namespace, you just need the ability to have a "local" namespace per-person that can be shared among friends.  For (b), a program isn't going to be upset if you replace a file system with a database / dataset object / bucket.

Two examples:
- structural biology: I've seen a lot of different analysis workflows (such as autodock) that compare a protein against a "database" of ligands, where the database is 80,000 O(4KB) files.  Each file represents a known ligand that the biologist might come back and examine if it is relevant to the study of their protein.
- high-energy physics: Each detector can produce millions of events a night, and experiments will produce many billions of events.  These are saved into files (each file containing hundreds or thousands of events); these files are kept in collections (called datasets, data blocks, lumi sections, or runs, depending on what you're doing).  Tasks are run against datasets; they will output smaller datasets which the physicists will iterate upon until they get some dataset which fits onto their laptop.
Here's my Claim: The biologists have a small enough number of objects to manage each one as a separate file; they do this because it's easier for humans navigating around in a terminal.  The physicists have such a huge number of objects that there's no way to manage them using one file per object, so they utilize files only as a mechanism to serialize bytes of data and have higher-order data structures for management.
Here's my Question: at what point do you move from the biologist's model (named objects, managed independently, single files) to the physicist's model (anonymous objects, managed in large groups, files are only used because we save data on file systems)?

Another way to look at this is to consider DNS.  DNS maintains the namespace of the globe, but appears to do this just fine without a single central catalog.  If you start with a POSIX filesystem namespace (and the guarantees it implies), what rules must you relax in order to arrive at DNS?  On the scale of managing million (billion? ten billion? trillion?) files, are any of the assumptions relevant?

I don't know the answers to these questions, but I suspect they become important over the next 10 years.

Brian

PS - I started thinking along these lines during MSST when the LLNL guy was speculating about what it meant to "fsck" a file system with 1 trillion files.

On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:

> You can also get some performance numbers and answers to the block size dilemma problem here:
> 
> http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
> 
> I remember some people were using Hadoop for storing or streaming videos.
> Don't know how well that worked.
> It would be interesting to learn about your experience.
> 
> Thanks,
> --Konstantin
> 
> 
> On 5/18/2010 8:41 AM, Brian Bockelman wrote:
>> Hey Hassan,
>> 
>> 1) The overhead is pretty small, measured in a small number of milliseconds on average
>> 2) HDFS is not designed for "online latency".  Even though the average is small, if something "bad happens", your clients might experience a lot of delays while going through the retry stack.  The initial design was for batch processing, and latency-sensitive applications came later.
>> 
>> Additionally since the NN is a SPOF, you might want to consider your uptime requirements.  Each organization will have to balance these risks with the advantages (such as much cheaper hardware).
>> 
>> There's a nice interview with the GFS authors here where they touch upon the latency issues:
>> 
>> http://queue.acm.org/detail.cfm?id=1594206
>> 
>> As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you.
>> 
>> As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
>> 
>> Brian
>> 
>> On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
>> 
>>> This is a very interesting thread to us, as we are thinking about deploying
>>> HDFS as a massive online storage for a on online university, and then
>>> serving the video files to students who want to view them.
>>> 
>>> We cannot control the size of the videos (and some class work files), as
>>> they will mostly be uploaded by the teachers providing the classes.
>>> 
>>> How would the overall through put of HDFS be affected in such a solution?
>>> Would HDFS be feasible at all for such a setup?
>>> 
>>> Regards
>>> HASSAN
>>> 
>>> 
>>> 
>>> On Tue, May 18, 2010 at 21:11, He Chen<ai...@gmail.com>  wrote:
>>> 
>>>> If you know how to use AspectJ to do aspect oriented programming. You can
>>>> write a aspect class. Let it just monitors the whole process of MapReduce
>>>> 
>>>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles<patrick@cloudera.com
>>>>> wrote:
>>>> 
>>>>> Should be evident in the total job running time... that's the only metric
>>>>> that really matters :)
>>>>> 
>>>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>>> wrote:
>>>>> 
>>>>>> Thank you,
>>>>>> Any way I can measure the startup overhead in terms of time?
>>>>>> 
>>>>>> 
>>>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles<patrick@cloudera.com
>>>>>>> wrote:
>>>>>> 
>>>>>>> Pierre,
>>>>>>> 
>>>>>>> Adding to what Brian has said (some things are not explicitly
>>>> mentioned
>>>>>> in
>>>>>>> the HDFS design doc)...
>>>>>>> 
>>>>>>> - If you have small files that take up<  64MB you do not actually use
>>>>> the
>>>>>>> entire 64MB block on disk.
>>>>>>> - You *do* use up RAM on the NameNode, as each block represents
>>>>> meta-data
>>>>>>> that needs to be maintained in-memory in the NameNode.
>>>>>>> - Hadoop won't perform optimally with very small block sizes. Hadoop
>>>>> I/O
>>>>>> is
>>>>>>> optimized for high sustained throughput per single file/block. There
>>>> is
>>>>> a
>>>>>>> penalty for doing too many seeks to get to the beginning of each
>>>> block.
>>>>>>> Additionally, you will have a MapReduce task per small file. Each
>>>>>> MapReduce
>>>>>>> task has a non-trivial startup overhead.
>>>>>>> - The recommendation is to consolidate your small files into large
>>>>> files.
>>>>>>> One way to do this is via SequenceFiles... put the filename in the
>>>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
>>>> value
>>>>>>> field.
>>>>>>> 
>>>>>>> In addition to the HDFS design docs, I recommend reading this blog
>>>>> post:
>>>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>>> 
>>>>>>> Happy Hadooping,
>>>>>>> 
>>>>>>> - Patrick
>>>>>>> 
>>>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Okay, thank you :)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman<
>>>>> bbockelm@cse.unl.edu
>>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>>>>>>>>> 
>>>>>>>>>> Hi, thanks for this fast answer :)
>>>>>>>>>> If so, what do you mean by blocks? If a file has to be
>>>> splitted,
>>>>> it
>>>>>>>> will
>>>>>>>>> be
>>>>>>>>>> splitted when larger than 64MB?
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
>>>>> So,
>>>>>>> if
>>>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
>>>> file
>>>>>> is
>>>>>>>> 65MB,
>>>>>>>>> then it will have one block of 64MB and another block of 1MB.
>>>>>>>>> 
>>>>>>>>> Splitting files is very useful for load-balancing and
>>>> distributing
>>>>>> I/O
>>>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
>>>>>> split
>>>>>>>> the
>>>>>>>>> files at all.
>>>>>>>>> 
>>>>>>>>> I recommend reading the HDFS design document for background
>>>> issues
>>>>>> like
>>>>>>>>> this:
>>>>>>>>> 
>>>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>>>>>>>>> 
>>>>>>>>> Brian
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman<
>>>>>>> bbockelm@cse.unl.edu
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hey Pierre,
>>>>>>>>>>> 
>>>>>>>>>>> These are not traditional filesystem blocks - if you save a
>>>> file
>>>>>>>> smaller
>>>>>>>>>>> than 64MB, you don't lose 64MB of file space..
>>>>>>>>>>> 
>>>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
>>>>>> metadata
>>>>>>>> or
>>>>>>>>>>> so), not 64MB.
>>>>>>>>>>> 
>>>>>>>>>>> Brian
>>>>>>>>>>> 
>>>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
>>>> bunch
>>>>> of
>>>>>>>> small
>>>>>>>>>>>> files.
>>>>>>>>>>>> I'm aware that having such small files ain't a good idea but
>>>>> I'm
>>>>>>> not
>>>>>>>>>>> doing
>>>>>>>>>>>> the technical decisions and the port has to be done for
>>>>>>> yesterday...
>>>>>>>>>>>> Of course such small files are a problem, loading 64MB blocks
>>>>> for
>>>>>> a
>>>>>>>> few
>>>>>>>>>>>> lines of text is an evident loss.
>>>>>>>>>>>> What will happen if I set a smaller, or even way smaller
>>>> (32kB)
>>>>>>>> blocks?
>>>>>>>>>>>> 
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>> 
>>>>>>>>>>>> Pierre ANCELOT.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>>> Ego sum quis ego servo
>>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>>> "I am what I protect"
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> http://www.neko-consulting.com
>>>>>>>> Ego sum quis ego servo
>>>>>>>> "Je suis ce que je protège"
>>>>>>>> "I am what I protect"
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> http://www.neko-consulting.com
>>>>>> Ego sum quis ego servo
>>>>>> "Je suis ce que je protège"
>>>>>> "I am what I protect"
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best Wishes!
>>>> 顺送商祺!
>>>> 
>>>> --
>>>> Chen He
>>>> (402)613-9298
>>>> PhD. student of CSE Dept.
>>>> Holland Computing Center
>>>> University of Nebraska-Lincoln
>>>> Lincoln NE 68588
>>>> 
>> 


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
You can also get some performance numbers and answers to the block size dilemma here:

http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html

I remember some people were using Hadoop for storing or streaming videos.
Don't know how well that worked.
It would be interesting to learn about your experience.

Thanks,
--Konstantin


On 5/18/2010 8:41 AM, Brian Bockelman wrote:
> Hey Hassan,
>
> 1) The overhead is pretty small, measured in a small number of milliseconds on average
> 2) HDFS is not designed for "online latency".  Even though the average is small, if something "bad happens", your clients might experience a lot of delays while going through the retry stack.  The initial design was for batch processing, and latency-sensitive applications came later.
>
> Additionally since the NN is a SPOF, you might want to consider your uptime requirements.  Each organization will have to balance these risks with the advantages (such as much cheaper hardware).
>
> There's a nice interview with the GFS authors here where they touch upon the latency issues:
>
> http://queue.acm.org/detail.cfm?id=1594206
>
> As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you.
>
> As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
>
> Brian
>
> On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
>
>> This is a very interesting thread to us, as we are thinking about deploying
>> HDFS as a massive online storage for a on online university, and then
>> serving the video files to students who want to view them.
>>
>> We cannot control the size of the videos (and some class work files), as
>> they will mostly be uploaded by the teachers providing the classes.
>>
>> How would the overall through put of HDFS be affected in such a solution?
>> Would HDFS be feasible at all for such a setup?
>>
>> Regards
>> HASSAN
>>
>>
>>
>> On Tue, May 18, 2010 at 21:11, He Chen<ai...@gmail.com>  wrote:
>>
>>> If you know how to use AspectJ to do aspect oriented programming. You can
>>> write a aspect class. Let it just monitors the whole process of MapReduce
>>>
>>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles<patrick@cloudera.com
>>>> wrote:
>>>
>>>> Should be evident in the total job running time... that's the only metric
>>>> that really matters :)
>>>>
>>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>> wrote:
>>>>
>>>>> Thank you,
>>>>> Any way I can measure the startup overhead in terms of time?
>>>>>
>>>>>
>>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles<patrick@cloudera.com
>>>>>> wrote:
>>>>>
>>>>>> Pierre,
>>>>>>
>>>>>> Adding to what Brian has said (some things are not explicitly
>>> mentioned
>>>>> in
>>>>>> the HDFS design doc)...
>>>>>>
>>>>>> - If you have small files that take up<  64MB you do not actually use
>>>> the
>>>>>> entire 64MB block on disk.
>>>>>> - You *do* use up RAM on the NameNode, as each block represents
>>>> meta-data
>>>>>> that needs to be maintained in-memory in the NameNode.
>>>>>> - Hadoop won't perform optimally with very small block sizes. Hadoop
>>>> I/O
>>>>> is
>>>>>> optimized for high sustained throughput per single file/block. There
>>> is
>>>> a
>>>>>> penalty for doing too many seeks to get to the beginning of each
>>> block.
>>>>>> Additionally, you will have a MapReduce task per small file. Each
>>>>> MapReduce
>>>>>> task has a non-trivial startup overhead.
>>>>>> - The recommendation is to consolidate your small files into large
>>>> files.
>>>>>> One way to do this is via SequenceFiles... put the filename in the
>>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
>>> value
>>>>>> field.
>>>>>>
>>>>>> In addition to the HDFS design docs, I recommend reading this blog
>>>> post:
>>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>>
>>>>>> Happy Hadooping,
>>>>>>
>>>>>> - Patrick
>>>>>>
>>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT<pierreact@gmail.com
>>>>
>>>>>> wrote:
>>>>>>
>>>>>>> Okay, thank you :)
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman<
>>>> bbockelm@cse.unl.edu
>>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>>>>>>>>
>>>>>>>>> Hi, thanks for this fast answer :)
>>>>>>>>> If so, what do you mean by blocks? If a file has to be
>>> splitted,
>>>> it
>>>>>>> will
>>>>>>>> be
>>>>>>>>> splitted when larger than 64MB?
>>>>>>>>>
>>>>>>>>
>>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
>>>> So,
>>>>>> if
>>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
>>> file
>>>>> is
>>>>>>> 65MB,
>>>>>>>> then it will have one block of 64MB and another block of 1MB.
>>>>>>>>
>>>>>>>> Splitting files is very useful for load-balancing and
>>> distributing
>>>>> I/O
>>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
>>>>> split
>>>>>>> the
>>>>>>>> files at all.
>>>>>>>>
>>>>>>>> I recommend reading the HDFS design document for background
>>> issues
>>>>> like
>>>>>>>> this:
>>>>>>>>
>>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman<
>>>>>> bbockelm@cse.unl.edu
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hey Pierre,
>>>>>>>>>>
>>>>>>>>>> These are not traditional filesystem blocks - if you save a
>>> file
>>>>>>> smaller
>>>>>>>>>> than 64MB, you don't lose 64MB of file space..
>>>>>>>>>>
>>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
>>>>> metadata
>>>>>>> or
>>>>>>>>>> so), not 64MB.
>>>>>>>>>>
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
>>> bunch
>>>> of
>>>>>>> small
>>>>>>>>>>> files.
>>>>>>>>>>> I'm aware that having such small files ain't a good idea but
>>>> I'm
>>>>>> not
>>>>>>>>>> doing
>>>>>>>>>>> the technical decisions and the port has to be done for
>>>>>> yesterday...
>>>>>>>>>>> Of course such small files are a problem, loading 64MB blocks
>>>> for
>>>>> a
>>>>>>> few
>>>>>>>>>>> lines of text is an evident loss.
>>>>>>>>>>> What will happen if I set a smaller, or even way smaller
>>> (32kB)
>>>>>>> blocks?
>>>>>>>>>>>
>>>>>>>>>>> Thank you.
>>>>>>>>>>>
>>>>>>>>>>> Pierre ANCELOT.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> http://www.neko-consulting.com
>>>>>>>>> Ego sum quis ego servo
>>>>>>>>> "Je suis ce que je protège"
>>>>>>>>> "I am what I protect"
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> http://www.neko-consulting.com
>>>>>>> Ego sum quis ego servo
>>>>>>> "Je suis ce que je protège"
>>>>>>> "I am what I protect"
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> http://www.neko-consulting.com
>>>>> Ego sum quis ego servo
>>>>> "Je suis ce que je protège"
>>>>> "I am what I protect"
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Best Wishes!
>>> 顺送商祺!
>>>
>>> --
>>> Chen He
>>> (402)613-9298
>>> PhD. student of CSE Dept.
>>> Holland Computing Center
>>> University of Nebraska-Lincoln
>>> Lincoln NE 68588
>>>
>


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Hassan,

1) The overhead is pretty small, measured in a small number of milliseconds on average
2) HDFS is not designed for "online latency".  Even though the average is small, if something "bad happens", your clients might experience a lot of delays while going through the retry stack.  The initial design was for batch processing, and latency-sensitive applications came later.

Additionally, since the NN is a single point of failure (SPOF), you might want to consider your uptime requirements.  Each organization will have to balance these risks with the advantages (such as much cheaper hardware).

There's a nice interview with the GFS authors here where they touch upon the latency issues:

http://queue.acm.org/detail.cfm?id=1594206

As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you.

As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.

Brian
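
A minimal sketch for checking the per-operation overhead described in point 1
above: time a single open plus first read of a small HDFS file. The path is
illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class OpenLatency {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path(args.length > 0 ? args[0] : "/tmp/tiny.txt");
        byte[] buf = new byte[4096];
        long start = System.nanoTime();
        FSDataInputStream in = fs.open(p);   // NameNode metadata lookup
        int n = in.read(buf);                // first DataNode round trip
        long micros = (System.nanoTime() - start) / 1000;
        in.close();
        System.out.println("read " + n + " bytes; open + first read took "
            + micros + " us");
      }
    }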

On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:

> This is a very interesting thread to us, as we are thinking about deploying
> HDFS as a massive online storage for a on online university, and then
> serving the video files to students who want to view them.
> 
> We cannot control the size of the videos (and some class work files), as
> they will mostly be uploaded by the teachers providing the classes.
> 
> How would the overall through put of HDFS be affected in such a solution?
> Would HDFS be feasible at all for such a setup?
> 
> Regards
> HASSAN
> 
> 
> 
> On Tue, May 18, 2010 at 21:11, He Chen <ai...@gmail.com> wrote:
> 
>> If you know how to use AspectJ to do aspect oriented programming. You can
>> write a aspect class. Let it just monitors the whole process of MapReduce
>> 
>> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles <patrick@cloudera.com
>>> wrote:
>> 
>>> Should be evident in the total job running time... that's the only metric
>>> that really matters :)
>>> 
>>> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
>>>> wrote:
>>> 
>>>> Thank you,
>>>> Any way I can measure the startup overhead in terms of time?
>>>> 
>>>> 
>>>> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
>>>>> wrote:
>>>> 
>>>>> Pierre,
>>>>> 
>>>>> Adding to what Brian has said (some things are not explicitly
>> mentioned
>>>> in
>>>>> the HDFS design doc)...
>>>>> 
>>>>> - If you have small files that take up < 64MB you do not actually use
>>> the
>>>>> entire 64MB block on disk.
>>>>> - You *do* use up RAM on the NameNode, as each block represents
>>> meta-data
>>>>> that needs to be maintained in-memory in the NameNode.
>>>>> - Hadoop won't perform optimally with very small block sizes. Hadoop
>>> I/O
>>>> is
>>>>> optimized for high sustained throughput per single file/block. There
>> is
>>> a
>>>>> penalty for doing too many seeks to get to the beginning of each
>> block.
>>>>> Additionally, you will have a MapReduce task per small file. Each
>>>> MapReduce
>>>>> task has a non-trivial startup overhead.
>>>>> - The recommendation is to consolidate your small files into large
>>> files.
>>>>> One way to do this is via SequenceFiles... put the filename in the
>>>>> SequenceFile key field, and the file's bytes in the SequenceFile
>> value
>>>>> field.
>>>>> 
>>>>> In addition to the HDFS design docs, I recommend reading this blog
>>> post:
>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>> 
>>>>> Happy Hadooping,
>>>>> 
>>>>> - Patrick
>>>>> 
>>>>> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pierreact@gmail.com
>>> 
>>>>> wrote:
>>>>> 
>>>>>> Okay, thank you :)
>>>>>> 
>>>>>> 
>>>>>> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
>>> bbockelm@cse.unl.edu
>>>>>>> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>>>>>>> 
>>>>>>>> Hi, thanks for this fast answer :)
>>>>>>>> If so, what do you mean by blocks? If a file has to be
>> splitted,
>>> it
>>>>>> will
>>>>>>> be
>>>>>>>> splitted when larger than 64MB?
>>>>>>>> 
>>>>>>> 
>>>>>>> For every 64MB of the file, Hadoop will create a separate block.
>>> So,
>>>>> if
>>>>>>> you have a 32KB file, there will be one block of 32KB.  If the
>> file
>>>> is
>>>>>> 65MB,
>>>>>>> then it will have one block of 64MB and another block of 1MB.
>>>>>>> 
>>>>>>> Splitting files is very useful for load-balancing and
>> distributing
>>>> I/O
>>>>>>> across multiple nodes.  At 32KB / file, you don't really need to
>>>> split
>>>>>> the
>>>>>>> files at all.
>>>>>>> 
>>>>>>> I recommend reading the HDFS design document for background
>> issues
>>>> like
>>>>>>> this:
>>>>>>> 
>>>>>>> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>>>>>>> 
>>>>>>> Brian
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
>>>>> bbockelm@cse.unl.edu
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hey Pierre,
>>>>>>>>> 
>>>>>>>>> These are not traditional filesystem blocks - if you save a
>> file
>>>>>> smaller
>>>>>>>>> than 64MB, you don't lose 64MB of file space..
>>>>>>>>> 
>>>>>>>>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
>>>> metadata
>>>>>> or
>>>>>>>>> so), not 64MB.
>>>>>>>>> 
>>>>>>>>> Brian
>>>>>>>>> 
>>>>>>>>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> I'm porting a legacy application to hadoop and it uses a
>> bunch
>>> of
>>>>>> small
>>>>>>>>>> files.
>>>>>>>>>> I'm aware that having such small files ain't a good idea but
>>> I'm
>>>>> not
>>>>>>>>> doing
>>>>>>>>>> the technical decisions and the port has to be done for
>>>>> yesterday...
>>>>>>>>>> Of course such small files are a problem, loading 64MB blocks
>>> for
>>>> a
>>>>>> few
>>>>>>>>>> lines of text is an evident loss.
>>>>>>>>>> What will happen if I set a smaller, or even way smaller
>> (32kB)
>>>>>> blocks?
>>>>>>>>>> 
>>>>>>>>>> Thank you.
>>>>>>>>>> 
>>>>>>>>>> Pierre ANCELOT.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> http://www.neko-consulting.com
>>>>>>>> Ego sum quis ego servo
>>>>>>>> "Je suis ce que je protège"
>>>>>>>> "I am what I protect"
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> http://www.neko-consulting.com
>>>>>> Ego sum quis ego servo
>>>>>> "Je suis ce que je protège"
>>>>>> "I am what I protect"
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> http://www.neko-consulting.com
>>>> Ego sum quis ego servo
>>>> "Je suis ce que je protège"
>>>> "I am what I protect"
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Best Wishes!
>> 顺送商祺!
>> 
>> --
>> Chen He
>> (402)613-9298
>> PhD. student of CSE Dept.
>> Holland Computing Center
>> University of Nebraska-Lincoln
>> Lincoln NE 68588
>> 


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Nyamul Hassan <mn...@usa.net>.
This is a very interesting thread to us, as we are thinking about deploying
HDFS as massive online storage for an online university, and then serving
the video files to students who want to view them.

We cannot control the size of the videos (and some class work files), as
they will mostly be uploaded by the teachers providing the classes.

How would the overall throughput of HDFS be affected in such a solution?
Would HDFS be feasible at all for such a setup?

Regards
HASSAN



On Tue, May 18, 2010 at 21:11, He Chen <ai...@gmail.com> wrote:

> If you know how to use AspectJ to do aspect oriented programming. You can
> write a aspect class. Let it just monitors the whole process of MapReduce
>
> On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles <patrick@cloudera.com
> >wrote:
>
> > Should be evident in the total job running time... that's the only metric
> > that really matters :)
> >
> > On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> > >wrote:
> >
> > > Thank you,
> > > Any way I can measure the startup overhead in terms of time?
> > >
> > >
> > > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> > > >wrote:
> > >
> > > > Pierre,
> > > >
> > > > Adding to what Brian has said (some things are not explicitly
> mentioned
> > > in
> > > > the HDFS design doc)...
> > > >
> > > > - If you have small files that take up < 64MB you do not actually use
> > the
> > > > entire 64MB block on disk.
> > > > - You *do* use up RAM on the NameNode, as each block represents
> > meta-data
> > > > that needs to be maintained in-memory in the NameNode.
> > > > - Hadoop won't perform optimally with very small block sizes. Hadoop
> > I/O
> > > is
> > > > optimized for high sustained throughput per single file/block. There
> is
> > a
> > > > penalty for doing too many seeks to get to the beginning of each
> block.
> > > > Additionally, you will have a MapReduce task per small file. Each
> > > MapReduce
> > > > task has a non-trivial startup overhead.
> > > > - The recommendation is to consolidate your small files into large
> > files.
> > > > One way to do this is via SequenceFiles... put the filename in the
> > > > SequenceFile key field, and the file's bytes in the SequenceFile
> value
> > > > field.
> > > >
> > > > In addition to the HDFS design docs, I recommend reading this blog
> > post:
> > > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > > >
> > > > Happy Hadooping,
> > > >
> > > > - Patrick
> > > >
> > > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pierreact@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Okay, thank you :)
> > > > >
> > > > >
> > > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> > bbockelm@cse.unl.edu
> > > > > >wrote:
> > > > >
> > > > > >
> > > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > > >
> > > > > > > Hi, thanks for this fast answer :)
> > > > > > > If so, what do you mean by blocks? If a file has to be
> splitted,
> > it
> > > > > will
> > > > > > be
> > > > > > > splitted when larger than 64MB?
> > > > > > >
> > > > > >
> > > > > > For every 64MB of the file, Hadoop will create a separate block.
> >  So,
> > > > if
> > > > > > you have a 32KB file, there will be one block of 32KB.  If the
> file
> > > is
> > > > > 65MB,
> > > > > > then it will have one block of 64MB and another block of 1MB.
> > > > > >
> > > > > > Splitting files is very useful for load-balancing and
> distributing
> > > I/O
> > > > > > across multiple nodes.  At 32KB / file, you don't really need to
> > > split
> > > > > the
> > > > > > files at all.
> > > > > >
> > > > > > I recommend reading the HDFS design document for background
> issues
> > > like
> > > > > > this:
> > > > > >
> > > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > > >
> > > > > > Brian
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > > bbockelm@cse.unl.edu
> > > > > > >wrote:
> > > > > > >
> > > > > > >> Hey Pierre,
> > > > > > >>
> > > > > > >> These are not traditional filesystem blocks - if you save a
> file
> > > > > smaller
> > > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > > >>
> > > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > > metadata
> > > > > or
> > > > > > >> so), not 64MB.
> > > > > > >>
> > > > > > >> Brian
> > > > > > >>
> > > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > > >>
> > > > > > >>> Hi,
> > > > > > >>> I'm porting a legacy application to hadoop and it uses a
> bunch
> > of
> > > > > small
> > > > > > >>> files.
> > > > > > >>> I'm aware that having such small files ain't a good idea but
> > I'm
> > > > not
> > > > > > >> doing
> > > > > > >>> the technical decisions and the port has to be done for
> > > > yesterday...
> > > > > > >>> Of course such small files are a problem, loading 64MB blocks
> > for
> > > a
> > > > > few
> > > > > > >>> lines of text is an evident loss.
> > > > > > >>> What will happen if I set a smaller, or even way smaller
> (32kB)
> > > > > blocks?
> > > > > > >>>
> > > > > > >>> Thank you.
> > > > > > >>>
> > > > > > >>> Pierre ANCELOT.
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > http://www.neko-consulting.com
> > > > > > > Ego sum quis ego servo
> > > > > > > "Je suis ce que je protège"
> > > > > > > "I am what I protect"
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > http://www.neko-consulting.com
> > > > > Ego sum quis ego servo
> > > > > "Je suis ce que je protège"
> > > > > "I am what I protect"
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
>
>
>
> --
> Best Wishes!
> 顺送商祺!
>
> --
> Chen He
> (402)613-9298
> PhD. student of CSE Dept.
> Holland Computing Center
> University of Nebraska-Lincoln
> Lincoln NE 68588
>

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by He Chen <ai...@gmail.com>.
If you know how to use AspectJ for aspect-oriented programming, you can
write an aspect class and let it just monitor the whole MapReduce process.

On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles <pa...@cloudera.com>wrote:

> Should be evident in the total job running time... that's the only metric
> that really matters :)
>
> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> >wrote:
>
> > Thank you,
> > Any way I can measure the startup overhead in terms of time?
> >
> >
> > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Pierre,
> > >
> > > Adding to what Brian has said (some things are not explicitly mentioned
> > in
> > > the HDFS design doc)...
> > >
> > > - If you have small files that take up < 64MB you do not actually use
> the
> > > entire 64MB block on disk.
> > > - You *do* use up RAM on the NameNode, as each block represents
> meta-data
> > > that needs to be maintained in-memory in the NameNode.
> > > - Hadoop won't perform optimally with very small block sizes. Hadoop
> I/O
> > is
> > > optimized for high sustained throughput per single file/block. There is
> a
> > > penalty for doing too many seeks to get to the beginning of each block.
> > > Additionally, you will have a MapReduce task per small file. Each
> > MapReduce
> > > task has a non-trivial startup overhead.
> > > - The recommendation is to consolidate your small files into large
> files.
> > > One way to do this is via SequenceFiles... put the filename in the
> > > SequenceFile key field, and the file's bytes in the SequenceFile value
> > > field.
> > >
> > > In addition to the HDFS design docs, I recommend reading this blog
> post:
> > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > >
> > > Happy Hadooping,
> > >
> > > - Patrick
> > >
> > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pi...@gmail.com>
> > > wrote:
> > >
> > > > Okay, thank you :)
> > > >
> > > >
> > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> bbockelm@cse.unl.edu
> > > > >wrote:
> > > >
> > > > >
> > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > >
> > > > > > Hi, thanks for this fast answer :)
> > > > > > If so, what do you mean by blocks? If a file has to be splitted,
> it
> > > > will
> > > > > be
> > > > > > splitted when larger than 64MB?
> > > > > >
> > > > >
> > > > > For every 64MB of the file, Hadoop will create a separate block.
>  So,
> > > if
> > > > > you have a 32KB file, there will be one block of 32KB.  If the file
> > is
> > > > 65MB,
> > > > > then it will have one block of 64MB and another block of 1MB.
> > > > >
> > > > > Splitting files is very useful for load-balancing and distributing
> > I/O
> > > > > across multiple nodes.  At 32KB / file, you don't really need to
> > split
> > > > the
> > > > > files at all.
> > > > >
> > > > > I recommend reading the HDFS design document for background issues
> > like
> > > > > this:
> > > > >
> > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > >
> > > > > Brian
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > bbockelm@cse.unl.edu
> > > > > >wrote:
> > > > > >
> > > > > >> Hey Pierre,
> > > > > >>
> > > > > >> These are not traditional filesystem blocks - if you save a file
> > > > smaller
> > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > >>
> > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > metadata
> > > > or
> > > > > >> so), not 64MB.
> > > > > >>
> > > > > >> Brian
> > > > > >>
> > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > >>
> > > > > >>> Hi,
> > > > > >>> I'm porting a legacy application to hadoop and it uses a bunch
> of
> > > > small
> > > > > >>> files.
> > > > > >>> I'm aware that having such small files ain't a good idea but
> I'm
> > > not
> > > > > >> doing
> > > > > >>> the technical decisions and the port has to be done for
> > > yesterday...
> > > > > >>> Of course such small files are a problem, loading 64MB blocks
> for
> > a
> > > > few
> > > > > >>> lines of text is an evident loss.
> > > > > >>> What will happen if I set a smaller, or even way smaller (32kB)
> > > > blocks?
> > > > > >>>
> > > > > >>> Thank you.
> > > > > >>>
> > > > > >>> Pierre ANCELOT.
> > > > > >>
> > > > > >>
> > > > > >
> > > > > >
> > > > > > --
> > > > > > http://www.neko-consulting.com
> > > > > > Ego sum quis ego servo
> > > > > > "Je suis ce que je protège"
> > > > > > "I am what I protect"
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > http://www.neko-consulting.com
> > > > Ego sum quis ego servo
> > > > "Je suis ce que je protège"
> > > > "I am what I protect"
> > > >
> > >
> >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>



-- 
Best Wishes!
顺送商祺!

--
Chen He
(402)613-9298
PhD. student of CSE Dept.
Holland Computing Center
University of Nebraska-Lincoln
Lincoln NE 68588

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, May 18, 2010 at 2:50 PM, Jones, Nick <ni...@amd.com> wrote:

> I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop
> Archive) work well in this situation?  I thought it was designed to collect
> several small files together through another level indirection to avoid the
> NN load and decreasing the HDFS block size.
>
>
Yes, or CombineFileInputFormat. JVM reuse also helps somewhat, so long as
you're not talking about hundreds of thousands of files (in which case it
starts to hurt JT load with that many tasks in jobs)

There are a number of ways to combat the issue, but the rule of thumb is that
you shouldn't try to use HDFS to store tons of small files :)
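
For the JVM reuse knob, a minimal sketch of the job setup code (assuming the
0.20-era org.apache.hadoop.mapred.JobConf API; "MyJob" is a hypothetical job
class):

    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(MyJob.class);   // MyJob is a placeholder job class
    // Reuse each task JVM for an unlimited number of this job's tasks,
    // i.e. mapred.job.reuse.jvm.num.tasks = -1.  This amortizes JVM startup
    // across many tiny map tasks, though the per-task scheduling cost remains.
    conf.setNumTasksToExecutePerJvm(-1);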

-Todd

-----Original Message-----
> From: patrickangeles@gmail.com [mailto:patrickangeles@gmail.com] On Behalf
> Of Patrick Angeles
> Sent: Tuesday, May 18, 2010 4:36 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Any possible to set hdfs block size to a value smaller than
> 64MB?
>
> That wasn't sarcasm. This is what you do:
>
> - Run your mapreduce job on 30k small files.
> - Consolidate your 30k small files into larger files.
> - Run mapreduce on the larger files.
> - Compare the running time
>
> The difference in runtime is made up by your task startup and seek
> overhead.
>
> If you want to get the 'average' overhead per task, divide the total times
> for each job by the number of map tasks. This won't be a true average
> because with larger chunks of data, you will have longer running map tasks
> that will hold up the shuffle phase. But the average doesn't really matter
> here because you always have that trade-off going from small to large
> chunks
> of data.
>
>
> On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT <pi...@gmail.com>
> wrote:
>
> > Thanks for the sarcasm but with 30000 small files and so, 30000 Mapper
> > instatiations, even though it's not (and never did I say it was) he only
> > metric that matters, it seem to me lie something very interresting to
> check
> > out...
> > I have hierarchy over me and they will be happy to understand my choices
> > with real numbers to base their understanding on.
> > Thanks.
> >
> >
> > On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Should be evident in the total job running time... that's the only
> metric
> > > that really matters :)
> > >
> > > On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> > > >wrote:
> > >
> > > > Thank you,
> > > > Any way I can measure the startup overhead in terms of time?
> > > >
> > > >
> > > > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <
> patrick@cloudera.com
> > > > >wrote:
> > > >
> > > > > Pierre,
> > > > >
> > > > > Adding to what Brian has said (some things are not explicitly
> > mentioned
> > > > in
> > > > > the HDFS design doc)...
> > > > >
> > > > > - If you have small files that take up < 64MB you do not actually
> use
> > > the
> > > > > entire 64MB block on disk.
> > > > > - You *do* use up RAM on the NameNode, as each block represents
> > > meta-data
> > > > > that needs to be maintained in-memory in the NameNode.
> > > > > - Hadoop won't perform optimally with very small block sizes.
> Hadoop
> > > I/O
> > > > is
> > > > > optimized for high sustained throughput per single file/block.
> There
> > is
> > > a
> > > > > penalty for doing too many seeks to get to the beginning of each
> > block.
> > > > > Additionally, you will have a MapReduce task per small file. Each
> > > > MapReduce
> > > > > task has a non-trivial startup overhead.
> > > > > - The recommendation is to consolidate your small files into large
> > > files.
> > > > > One way to do this is via SequenceFiles... put the filename in the
> > > > > SequenceFile key field, and the file's bytes in the SequenceFile
> > value
> > > > > field.
> > > > >
> > > > > In addition to the HDFS design docs, I recommend reading this blog
> > > post:
> > > > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > > > >
> > > > > Happy Hadooping,
> > > > >
> > > > > - Patrick
> > > > >
> > > > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <
> pierreact@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Okay, thank you :)
> > > > > >
> > > > > >
> > > > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> > > bbockelm@cse.unl.edu
> > > > > > >wrote:
> > > > > >
> > > > > > >
> > > > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > > > >
> > > > > > > > Hi, thanks for this fast answer :)
> > > > > > > > If so, what do you mean by blocks? If a file has to be
> > splitted,
> > > it
> > > > > > will
> > > > > > > be
> > > > > > > > splitted when larger than 64MB?
> > > > > > > >
> > > > > > >
> > > > > > > For every 64MB of the file, Hadoop will create a separate
> block.
> > >  So,
> > > > > if
> > > > > > > you have a 32KB file, there will be one block of 32KB.  If the
> > file
> > > > is
> > > > > > 65MB,
> > > > > > > then it will have one block of 64MB and another block of 1MB.
> > > > > > >
> > > > > > > Splitting files is very useful for load-balancing and
> > distributing
> > > > I/O
> > > > > > > across multiple nodes.  At 32KB / file, you don't really need
> to
> > > > split
> > > > > > the
> > > > > > > files at all.
> > > > > > >
> > > > > > > I recommend reading the HDFS design document for background
> > issues
> > > > like
> > > > > > > this:
> > > > > > >
> > > > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > > > >
> > > > > > > Brian
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > > > bbockelm@cse.unl.edu
> > > > > > > >wrote:
> > > > > > > >
> > > > > > > >> Hey Pierre,
> > > > > > > >>
> > > > > > > >> These are not traditional filesystem blocks - if you save a
> > file
> > > > > > smaller
> > > > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > > > >>
> > > > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > > > metadata
> > > > > > or
> > > > > > > >> so), not 64MB.
> > > > > > > >>
> > > > > > > >> Brian
> > > > > > > >>
> > > > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > > > >>
> > > > > > > >>> Hi,
> > > > > > > >>> I'm porting a legacy application to hadoop and it uses a
> > bunch
> > > of
> > > > > > small
> > > > > > > >>> files.
> > > > > > > >>> I'm aware that having such small files ain't a good idea
> but
> > > I'm
> > > > > not
> > > > > > > >> doing
> > > > > > > >>> the technical decisions and the port has to be done for
> > > > > yesterday...
> > > > > > > >>> Of course such small files are a problem, loading 64MB
> blocks
> > > for
> > > > a
> > > > > > few
> > > > > > > >>> lines of text is an evident loss.
> > > > > > > >>> What will happen if I set a smaller, or even way smaller
> > (32kB)
> > > > > > blocks?
> > > > > > > >>>
> > > > > > > >>> Thank you.
> > > > > > > >>>
> > > > > > > >>> Pierre ANCELOT.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > http://www.neko-consulting.com
> > > > > > > > Ego sum quis ego servo
> > > > > > > > "Je suis ce que je protège"
> > > > > > > > "I am what I protect"
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > http://www.neko-consulting.com
> > > > > > Ego sum quis ego servo
> > > > > > "Je suis ce que je protège"
> > > > > > "I am what I protect"
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > http://www.neko-consulting.com
> > > > Ego sum quis ego servo
> > > > "Je suis ce que je protège"
> > > > "I am what I protect"
> > > >
> > >
> >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

RE: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by "Jones, Nick" <ni...@amd.com>.
I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop Archive) work well in this situation?  I thought it was designed to collect several small files together through another level of indirection, to avoid the NN load without decreasing the HDFS block size.
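
For what it's worth, a rough sketch of reading back through a HAR (all paths
here are hypothetical, and the exact 'hadoop archive' flags depend on the
Hadoop version):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Archive built beforehand on the command line, e.g.:
    //   hadoop archive -archiveName files.har /user/pierre/smallfiles /user/pierre/archives
    // The original file names remain visible inside the (read-only) archive.
    Path p = new Path("har:///user/pierre/archives/files.har/smallfiles/doc-0001.txt");
    Configuration conf = new Configuration();
    FileSystem fs = p.getFileSystem(conf);   // resolves to the har:// filesystem
    FSDataInputStream in = fs.open(p);
    // ... read as with any other HDFS file ...
    in.close();

Note that a HAR mainly relieves the NameNode; as far as I know, a MapReduce
job over it still gets roughly one split per original file, so it doesn't
remove the per-task startup cost the way consolidating into SequenceFiles does.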

Nick Jones
-----Original Message-----
From: patrickangeles@gmail.com [mailto:patrickangeles@gmail.com] On Behalf Of Patrick Angeles
Sent: Tuesday, May 18, 2010 4:36 PM
To: common-user@hadoop.apache.org
Subject: Re: Any possible to set hdfs block size to a value smaller than 64MB?

That wasn't sarcasm. This is what you do:

- Run your mapreduce job on 30k small files.
- Consolidate your 30k small files into larger files.
- Run mapreduce on the larger files.
- Compare the running time

The difference in runtime is made up by your task startup and seek overhead.

If you want to get the 'average' overhead per task, divide the total times
for each job by the number of map tasks. This won't be a true average
because with larger chunks of data, you will have longer running map tasks
that will hold up the shuffle phase. But the average doesn't really matter
here because you always have that trade-off going from small to large chunks
of data.


On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT <pi...@gmail.com> wrote:

> Thanks for the sarcasm but with 30000 small files and so, 30000 Mapper
> instatiations, even though it's not (and never did I say it was) he only
> metric that matters, it seem to me lie something very interresting to check
> out...
> I have hierarchy over me and they will be happy to understand my choices
> with real numbers to base their understanding on.
> Thanks.
>
>
> On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles <patrick@cloudera.com
> >wrote:
>
> > Should be evident in the total job running time... that's the only metric
> > that really matters :)
> >
> > On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> > >wrote:
> >
> > > Thank you,
> > > Any way I can measure the startup overhead in terms of time?
> > >
> > >
> > > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> > > >wrote:
> > >
> > > > Pierre,
> > > >
> > > > Adding to what Brian has said (some things are not explicitly
> mentioned
> > > in
> > > > the HDFS design doc)...
> > > >
> > > > - If you have small files that take up < 64MB you do not actually use
> > the
> > > > entire 64MB block on disk.
> > > > - You *do* use up RAM on the NameNode, as each block represents
> > meta-data
> > > > that needs to be maintained in-memory in the NameNode.
> > > > - Hadoop won't perform optimally with very small block sizes. Hadoop
> > I/O
> > > is
> > > > optimized for high sustained throughput per single file/block. There
> is
> > a
> > > > penalty for doing too many seeks to get to the beginning of each
> block.
> > > > Additionally, you will have a MapReduce task per small file. Each
> > > MapReduce
> > > > task has a non-trivial startup overhead.
> > > > - The recommendation is to consolidate your small files into large
> > files.
> > > > One way to do this is via SequenceFiles... put the filename in the
> > > > SequenceFile key field, and the file's bytes in the SequenceFile
> value
> > > > field.
> > > >
> > > > In addition to the HDFS design docs, I recommend reading this blog
> > post:
> > > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > > >
> > > > Happy Hadooping,
> > > >
> > > > - Patrick
> > > >
> > > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pierreact@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Okay, thank you :)
> > > > >
> > > > >
> > > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> > bbockelm@cse.unl.edu
> > > > > >wrote:
> > > > >
> > > > > >
> > > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > > >
> > > > > > > Hi, thanks for this fast answer :)
> > > > > > > If so, what do you mean by blocks? If a file has to be
> splitted,
> > it
> > > > > will
> > > > > > be
> > > > > > > splitted when larger than 64MB?
> > > > > > >
> > > > > >
> > > > > > For every 64MB of the file, Hadoop will create a separate block.
> >  So,
> > > > if
> > > > > > you have a 32KB file, there will be one block of 32KB.  If the
> file
> > > is
> > > > > 65MB,
> > > > > > then it will have one block of 64MB and another block of 1MB.
> > > > > >
> > > > > > Splitting files is very useful for load-balancing and
> distributing
> > > I/O
> > > > > > across multiple nodes.  At 32KB / file, you don't really need to
> > > split
> > > > > the
> > > > > > files at all.
> > > > > >
> > > > > > I recommend reading the HDFS design document for background
> issues
> > > like
> > > > > > this:
> > > > > >
> > > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > > >
> > > > > > Brian
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > > bbockelm@cse.unl.edu
> > > > > > >wrote:
> > > > > > >
> > > > > > >> Hey Pierre,
> > > > > > >>
> > > > > > >> These are not traditional filesystem blocks - if you save a
> file
> > > > > smaller
> > > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > > >>
> > > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > > metadata
> > > > > or
> > > > > > >> so), not 64MB.
> > > > > > >>
> > > > > > >> Brian
> > > > > > >>
> > > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > > >>
> > > > > > >>> Hi,
> > > > > > >>> I'm porting a legacy application to hadoop and it uses a
> bunch
> > of
> > > > > small
> > > > > > >>> files.
> > > > > > >>> I'm aware that having such small files ain't a good idea but
> > I'm
> > > > not
> > > > > > >> doing
> > > > > > >>> the technical decisions and the port has to be done for
> > > > yesterday...
> > > > > > >>> Of course such small files are a problem, loading 64MB blocks
> > for
> > > a
> > > > > few
> > > > > > >>> lines of text is an evident loss.
> > > > > > >>> What will happen if I set a smaller, or even way smaller
> (32kB)
> > > > > blocks?
> > > > > > >>>
> > > > > > >>> Thank you.
> > > > > > >>>
> > > > > > >>> Pierre ANCELOT.
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > http://www.neko-consulting.com
> > > > > > > Ego sum quis ego servo
> > > > > > > "Je suis ce que je protège"
> > > > > > > "I am what I protect"
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > http://www.neko-consulting.com
> > > > > Ego sum quis ego servo
> > > > > "Je suis ce que je protège"
> > > > > "I am what I protect"
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
>
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
Okay, sorry then, I misunderstood.
I think I could as well run it on empty files; I would only get the task
startup overhead.
Thank you.
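
A quick sketch of that experiment, in case it helps (hypothetical path and
file count; assumes the standard FileSystem API):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MakeEmptyInputs {   // hypothetical helper class
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path dir = new Path("/tmp/startup-benchmark");   // hypothetical directory
        fs.mkdirs(dir);
        // 30000 empty inputs -> ~30000 map tasks that do no real work,
        // so total job time is dominated by per-task startup overhead.
        for (int i = 0; i < 30000; i++) {
          fs.create(new Path(dir, "empty-" + i)).close();
        }
      }
    }

Running an identity map-only job over that directory and dividing the job
runtime by the number of map tasks gives a rough per-task startup figure (if
your InputFormat skips zero-length files, write a single line into each file
instead).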

On Tue, May 18, 2010 at 11:36 PM, Patrick Angeles <pa...@cloudera.com>wrote:

> That wasn't sarcasm. This is what you do:
>
> - Run your mapreduce job on 30k small files.
> - Consolidate your 30k small files into larger files.
> - Run mapreduce on the larger files.
> - Compare the running time
>
> The difference in runtime is made up by your task startup and seek
> overhead.
>
> If you want to get the 'average' overhead per task, divide the total times
> for each job by the number of map tasks. This won't be a true average
> because with larger chunks of data, you will have longer running map tasks
> that will hold up the shuffle phase. But the average doesn't really matter
> here because you always have that trade-off going from small to large
> chunks
> of data.
>
>
> On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT <pi...@gmail.com>
> wrote:
>
> > Thanks for the sarcasm but with 30000 small files and so, 30000 Mapper
> > instatiations, even though it's not (and never did I say it was) he only
> > metric that matters, it seem to me lie something very interresting to
> check
> > out...
> > I have hierarchy over me and they will be happy to understand my choices
> > with real numbers to base their understanding on.
> > Thanks.
> >
> >
> > On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Should be evident in the total job running time... that's the only
> metric
> > > that really matters :)
> > >
> > > On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> > > >wrote:
> > >
> > > > Thank you,
> > > > Any way I can measure the startup overhead in terms of time?
> > > >
> > > >
> > > > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <
> patrick@cloudera.com
> > > > >wrote:
> > > >
> > > > > Pierre,
> > > > >
> > > > > Adding to what Brian has said (some things are not explicitly
> > mentioned
> > > > in
> > > > > the HDFS design doc)...
> > > > >
> > > > > - If you have small files that take up < 64MB you do not actually
> use
> > > the
> > > > > entire 64MB block on disk.
> > > > > - You *do* use up RAM on the NameNode, as each block represents
> > > meta-data
> > > > > that needs to be maintained in-memory in the NameNode.
> > > > > - Hadoop won't perform optimally with very small block sizes.
> Hadoop
> > > I/O
> > > > is
> > > > > optimized for high sustained throughput per single file/block.
> There
> > is
> > > a
> > > > > penalty for doing too many seeks to get to the beginning of each
> > block.
> > > > > Additionally, you will have a MapReduce task per small file. Each
> > > > MapReduce
> > > > > task has a non-trivial startup overhead.
> > > > > - The recommendation is to consolidate your small files into large
> > > files.
> > > > > One way to do this is via SequenceFiles... put the filename in the
> > > > > SequenceFile key field, and the file's bytes in the SequenceFile
> > value
> > > > > field.
> > > > >
> > > > > In addition to the HDFS design docs, I recommend reading this blog
> > > post:
> > > > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > > > >
> > > > > Happy Hadooping,
> > > > >
> > > > > - Patrick
> > > > >
> > > > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <
> pierreact@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Okay, thank you :)
> > > > > >
> > > > > >
> > > > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> > > bbockelm@cse.unl.edu
> > > > > > >wrote:
> > > > > >
> > > > > > >
> > > > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > > > >
> > > > > > > > Hi, thanks for this fast answer :)
> > > > > > > > If so, what do you mean by blocks? If a file has to be
> > splitted,
> > > it
> > > > > > will
> > > > > > > be
> > > > > > > > splitted when larger than 64MB?
> > > > > > > >
> > > > > > >
> > > > > > > For every 64MB of the file, Hadoop will create a separate
> block.
> > >  So,
> > > > > if
> > > > > > > you have a 32KB file, there will be one block of 32KB.  If the
> > file
> > > > is
> > > > > > 65MB,
> > > > > > > then it will have one block of 64MB and another block of 1MB.
> > > > > > >
> > > > > > > Splitting files is very useful for load-balancing and
> > distributing
> > > > I/O
> > > > > > > across multiple nodes.  At 32KB / file, you don't really need
> to
> > > > split
> > > > > > the
> > > > > > > files at all.
> > > > > > >
> > > > > > > I recommend reading the HDFS design document for background
> > issues
> > > > like
> > > > > > > this:
> > > > > > >
> > > > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > > > >
> > > > > > > Brian
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > > > bbockelm@cse.unl.edu
> > > > > > > >wrote:
> > > > > > > >
> > > > > > > >> Hey Pierre,
> > > > > > > >>
> > > > > > > >> These are not traditional filesystem blocks - if you save a
> > file
> > > > > > smaller
> > > > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > > > >>
> > > > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > > > metadata
> > > > > > or
> > > > > > > >> so), not 64MB.
> > > > > > > >>
> > > > > > > >> Brian
> > > > > > > >>
> > > > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > > > >>
> > > > > > > >>> Hi,
> > > > > > > >>> I'm porting a legacy application to hadoop and it uses a
> > bunch
> > > of
> > > > > > small
> > > > > > > >>> files.
> > > > > > > >>> I'm aware that having such small files ain't a good idea
> but
> > > I'm
> > > > > not
> > > > > > > >> doing
> > > > > > > >>> the technical decisions and the port has to be done for
> > > > > yesterday...
> > > > > > > >>> Of course such small files are a problem, loading 64MB
> blocks
> > > for
> > > > a
> > > > > > few
> > > > > > > >>> lines of text is an evident loss.
> > > > > > > >>> What will happen if I set a smaller, or even way smaller
> > (32kB)
> > > > > > blocks?
> > > > > > > >>>
> > > > > > > >>> Thank you.
> > > > > > > >>>
> > > > > > > >>> Pierre ANCELOT.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > http://www.neko-consulting.com
> > > > > > > > Ego sum quis ego servo
> > > > > > > > "Je suis ce que je protège"
> > > > > > > > "I am what I protect"
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > http://www.neko-consulting.com
> > > > > > Ego sum quis ego servo
> > > > > > "Je suis ce que je protège"
> > > > > > "I am what I protect"
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > http://www.neko-consulting.com
> > > > Ego sum quis ego servo
> > > > "Je suis ce que je protège"
> > > > "I am what I protect"
> > > >
> > >
> >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>



-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Patrick Angeles <pa...@cloudera.com>.
That wasn't sarcasm. This is what you do:

- Run your mapreduce job on 30k small files.
- Consolidate your 30k small files into larger files.
- Run mapreduce on the larger files.
- Compare the running time

The difference in runtime is made up by your task startup and seek overhead.

If you want to get the 'average' overhead per task, divide the total times
for each job by the number of map tasks. This won't be a true average
because with larger chunks of data, you will have longer running map tasks
that will hold up the shuffle phase. But the average doesn't really matter
here because you always have that trade-off going from small to large chunks
of data.
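
For example, with made-up numbers: if the 30,000-file job takes 3,600s across
30,000 map tasks and the consolidated job takes 600s across 500 map tasks, the
extra 3,000s spread over the 30,000 tasks works out to roughly 0.1s of
startup/seek overhead per task.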


On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT <pi...@gmail.com> wrote:

> Thanks for the sarcasm but with 30000 small files and so, 30000 Mapper
> instatiations, even though it's not (and never did I say it was) he only
> metric that matters, it seem to me lie something very interresting to check
> out...
> I have hierarchy over me and they will be happy to understand my choices
> with real numbers to base their understanding on.
> Thanks.
>
>
> On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles <patrick@cloudera.com
> >wrote:
>
> > Should be evident in the total job running time... that's the only metric
> > that really matters :)
> >
> > On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> > >wrote:
> >
> > > Thank you,
> > > Any way I can measure the startup overhead in terms of time?
> > >
> > >
> > > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> > > >wrote:
> > >
> > > > Pierre,
> > > >
> > > > Adding to what Brian has said (some things are not explicitly
> mentioned
> > > in
> > > > the HDFS design doc)...
> > > >
> > > > - If you have small files that take up < 64MB you do not actually use
> > the
> > > > entire 64MB block on disk.
> > > > - You *do* use up RAM on the NameNode, as each block represents
> > meta-data
> > > > that needs to be maintained in-memory in the NameNode.
> > > > - Hadoop won't perform optimally with very small block sizes. Hadoop
> > I/O
> > > is
> > > > optimized for high sustained throughput per single file/block. There
> is
> > a
> > > > penalty for doing too many seeks to get to the beginning of each
> block.
> > > > Additionally, you will have a MapReduce task per small file. Each
> > > MapReduce
> > > > task has a non-trivial startup overhead.
> > > > - The recommendation is to consolidate your small files into large
> > files.
> > > > One way to do this is via SequenceFiles... put the filename in the
> > > > SequenceFile key field, and the file's bytes in the SequenceFile
> value
> > > > field.
> > > >
> > > > In addition to the HDFS design docs, I recommend reading this blog
> > post:
> > > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > > >
> > > > Happy Hadooping,
> > > >
> > > > - Patrick
> > > >
> > > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pierreact@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Okay, thank you :)
> > > > >
> > > > >
> > > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> > bbockelm@cse.unl.edu
> > > > > >wrote:
> > > > >
> > > > > >
> > > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > > >
> > > > > > > Hi, thanks for this fast answer :)
> > > > > > > If so, what do you mean by blocks? If a file has to be
> splitted,
> > it
> > > > > will
> > > > > > be
> > > > > > > splitted when larger than 64MB?
> > > > > > >
> > > > > >
> > > > > > For every 64MB of the file, Hadoop will create a separate block.
> >  So,
> > > > if
> > > > > > you have a 32KB file, there will be one block of 32KB.  If the
> file
> > > is
> > > > > 65MB,
> > > > > > then it will have one block of 64MB and another block of 1MB.
> > > > > >
> > > > > > Splitting files is very useful for load-balancing and
> distributing
> > > I/O
> > > > > > across multiple nodes.  At 32KB / file, you don't really need to
> > > split
> > > > > the
> > > > > > files at all.
> > > > > >
> > > > > > I recommend reading the HDFS design document for background
> issues
> > > like
> > > > > > this:
> > > > > >
> > > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > > >
> > > > > > Brian
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > > bbockelm@cse.unl.edu
> > > > > > >wrote:
> > > > > > >
> > > > > > >> Hey Pierre,
> > > > > > >>
> > > > > > >> These are not traditional filesystem blocks - if you save a
> file
> > > > > smaller
> > > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > > >>
> > > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > > metadata
> > > > > or
> > > > > > >> so), not 64MB.
> > > > > > >>
> > > > > > >> Brian
> > > > > > >>
> > > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > > >>
> > > > > > >>> Hi,
> > > > > > >>> I'm porting a legacy application to hadoop and it uses a
> bunch
> > of
> > > > > small
> > > > > > >>> files.
> > > > > > >>> I'm aware that having such small files ain't a good idea but
> > I'm
> > > > not
> > > > > > >> doing
> > > > > > >>> the technical decisions and the port has to be done for
> > > > yesterday...
> > > > > > >>> Of course such small files are a problem, loading 64MB blocks
> > for
> > > a
> > > > > few
> > > > > > >>> lines of text is an evident loss.
> > > > > > >>> What will happen if I set a smaller, or even way smaller
> (32kB)
> > > > > blocks?
> > > > > > >>>
> > > > > > >>> Thank you.
> > > > > > >>>
> > > > > > >>> Pierre ANCELOT.
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > http://www.neko-consulting.com
> > > > > > > Ego sum quis ego servo
> > > > > > > "Je suis ce que je protège"
> > > > > > > "I am what I protect"
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > http://www.neko-consulting.com
> > > > > Ego sum quis ego servo
> > > > > "Je suis ce que je protège"
> > > > > "I am what I protect"
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
>
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
Thanks for the sarcasm, but with 30000 small files and thus 30000 Mapper
instantiations, even though it's not (and I never said it was) the only
metric that matters, it seems to me like something very interesting to check
out...
I have a hierarchy above me, and they will be happier understanding my
choices with real numbers to base that understanding on.
Thanks.


On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles <pa...@cloudera.com>wrote:

> Should be evident in the total job running time... that's the only metric
> that really matters :)
>
> On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pierreact@gmail.com
> >wrote:
>
> > Thank you,
> > Any way I can measure the startup overhead in terms of time?
> >
> >
> > On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Pierre,
> > >
> > > Adding to what Brian has said (some things are not explicitly mentioned
> > in
> > > the HDFS design doc)...
> > >
> > > - If you have small files that take up < 64MB you do not actually use
> the
> > > entire 64MB block on disk.
> > > - You *do* use up RAM on the NameNode, as each block represents
> meta-data
> > > that needs to be maintained in-memory in the NameNode.
> > > - Hadoop won't perform optimally with very small block sizes. Hadoop
> I/O
> > is
> > > optimized for high sustained throughput per single file/block. There is
> a
> > > penalty for doing too many seeks to get to the beginning of each block.
> > > Additionally, you will have a MapReduce task per small file. Each
> > MapReduce
> > > task has a non-trivial startup overhead.
> > > - The recommendation is to consolidate your small files into large
> files.
> > > One way to do this is via SequenceFiles... put the filename in the
> > > SequenceFile key field, and the file's bytes in the SequenceFile value
> > > field.
> > >
> > > In addition to the HDFS design docs, I recommend reading this blog
> post:
> > > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> > >
> > > Happy Hadooping,
> > >
> > > - Patrick
> > >
> > > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pi...@gmail.com>
> > > wrote:
> > >
> > > > Okay, thank you :)
> > > >
> > > >
> > > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <
> bbockelm@cse.unl.edu
> > > > >wrote:
> > > >
> > > > >
> > > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > > >
> > > > > > Hi, thanks for this fast answer :)
> > > > > > If so, what do you mean by blocks? If a file has to be splitted,
> it
> > > > will
> > > > > be
> > > > > > splitted when larger than 64MB?
> > > > > >
> > > > >
> > > > > For every 64MB of the file, Hadoop will create a separate block.
>  So,
> > > if
> > > > > you have a 32KB file, there will be one block of 32KB.  If the file
> > is
> > > > 65MB,
> > > > > then it will have one block of 64MB and another block of 1MB.
> > > > >
> > > > > Splitting files is very useful for load-balancing and distributing
> > I/O
> > > > > across multiple nodes.  At 32KB / file, you don't really need to
> > split
> > > > the
> > > > > files at all.
> > > > >
> > > > > I recommend reading the HDFS design document for background issues
> > like
> > > > > this:
> > > > >
> > > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > > >
> > > > > Brian
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > > bbockelm@cse.unl.edu
> > > > > >wrote:
> > > > > >
> > > > > >> Hey Pierre,
> > > > > >>
> > > > > >> These are not traditional filesystem blocks - if you save a file
> > > > smaller
> > > > > >> than 64MB, you don't lose 64MB of file space..
> > > > > >>
> > > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> > metadata
> > > > or
> > > > > >> so), not 64MB.
> > > > > >>
> > > > > >> Brian
> > > > > >>
> > > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > > >>
> > > > > >>> Hi,
> > > > > >>> I'm porting a legacy application to hadoop and it uses a bunch
> of
> > > > small
> > > > > >>> files.
> > > > > >>> I'm aware that having such small files ain't a good idea but
> I'm
> > > not
> > > > > >> doing
> > > > > >>> the technical decisions and the port has to be done for
> > > yesterday...
> > > > > >>> Of course such small files are a problem, loading 64MB blocks
> for
> > a
> > > > few
> > > > > >>> lines of text is an evident loss.
> > > > > >>> What will happen if I set a smaller, or even way smaller (32kB)
> > > > blocks?
> > > > > >>>
> > > > > >>> Thank you.
> > > > > >>>
> > > > > >>> Pierre ANCELOT.
> > > > > >>
> > > > > >>
> > > > > >
> > > > > >
> > > > > > --
> > > > > > http://www.neko-consulting.com
> > > > > > Ego sum quis ego servo
> > > > > > "Je suis ce que je protège"
> > > > > > "I am what I protect"
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > http://www.neko-consulting.com
> > > > Ego sum quis ego servo
> > > > "Je suis ce que je protège"
> > > > "I am what I protect"
> > > >
> > >
> >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>



-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Patrick Angeles <pa...@cloudera.com>.
Should be evident in the total job running time... that's the only metric
that really matters :)

On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT <pi...@gmail.com>wrote:

> Thank you,
> Any way I can measure the startup overhead in terms of time?
>
>
> On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <patrick@cloudera.com
> >wrote:
>
> > Pierre,
> >
> > Adding to what Brian has said (some things are not explicitly mentioned
> in
> > the HDFS design doc)...
> >
> > - If you have small files that take up < 64MB you do not actually use the
> > entire 64MB block on disk.
> > - You *do* use up RAM on the NameNode, as each block represents meta-data
> > that needs to be maintained in-memory in the NameNode.
> > - Hadoop won't perform optimally with very small block sizes. Hadoop I/O
> is
> > optimized for high sustained throughput per single file/block. There is a
> > penalty for doing too many seeks to get to the beginning of each block.
> > Additionally, you will have a MapReduce task per small file. Each
> MapReduce
> > task has a non-trivial startup overhead.
> > - The recommendation is to consolidate your small files into large files.
> > One way to do this is via SequenceFiles... put the filename in the
> > SequenceFile key field, and the file's bytes in the SequenceFile value
> > field.
> >
> > In addition to the HDFS design docs, I recommend reading this blog post:
> > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
> >
> > Happy Hadooping,
> >
> > - Patrick
> >
> > On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pi...@gmail.com>
> > wrote:
> >
> > > Okay, thank you :)
> > >
> > >
> > > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <bbockelm@cse.unl.edu
> > > >wrote:
> > >
> > > >
> > > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > > >
> > > > > Hi, thanks for this fast answer :)
> > > > > If so, what do you mean by blocks? If a file has to be splitted, it
> > > will
> > > > be
> > > > > splitted when larger than 64MB?
> > > > >
> > > >
> > > > For every 64MB of the file, Hadoop will create a separate block.  So,
> > if
> > > > you have a 32KB file, there will be one block of 32KB.  If the file
> is
> > > 65MB,
> > > > then it will have one block of 64MB and another block of 1MB.
> > > >
> > > > Splitting files is very useful for load-balancing and distributing
> I/O
> > > > across multiple nodes.  At 32KB / file, you don't really need to
> split
> > > the
> > > > files at all.
> > > >
> > > > I recommend reading the HDFS design document for background issues
> like
> > > > this:
> > > >
> > > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > > >
> > > > Brian
> > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> > bbockelm@cse.unl.edu
> > > > >wrote:
> > > > >
> > > > >> Hey Pierre,
> > > > >>
> > > > >> These are not traditional filesystem blocks - if you save a file
> > > smaller
> > > > >> than 64MB, you don't lose 64MB of file space..
> > > > >>
> > > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
> metadata
> > > or
> > > > >> so), not 64MB.
> > > > >>
> > > > >> Brian
> > > > >>
> > > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > > >>
> > > > >>> Hi,
> > > > >>> I'm porting a legacy application to hadoop and it uses a bunch of
> > > small
> > > > >>> files.
> > > > >>> I'm aware that having such small files ain't a good idea but I'm
> > not
> > > > >> doing
> > > > >>> the technical decisions and the port has to be done for
> > yesterday...
> > > > >>> Of course such small files are a problem, loading 64MB blocks for
> a
> > > few
> > > > >>> lines of text is an evident loss.
> > > > >>> What will happen if I set a smaller, or even way smaller (32kB)
> > > blocks?
> > > > >>>
> > > > >>> Thank you.
> > > > >>>
> > > > >>> Pierre ANCELOT.
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > http://www.neko-consulting.com
> > > > > Ego sum quis ego servo
> > > > > "Je suis ce que je protège"
> > > > > "I am what I protect"
> > > >
> > > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
>
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
Thank you,
Any way I can measure the startup overhead in terms of time?


On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles <pa...@cloudera.com>wrote:

> Pierre,
>
> Adding to what Brian has said (some things are not explicitly mentioned in
> the HDFS design doc)...
>
> - If you have small files that take up < 64MB you do not actually use the
> entire 64MB block on disk.
> - You *do* use up RAM on the NameNode, as each block represents meta-data
> that needs to be maintained in-memory in the NameNode.
> - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is
> optimized for high sustained throughput per single file/block. There is a
> penalty for doing too many seeks to get to the beginning of each block.
> Additionally, you will have a MapReduce task per small file. Each MapReduce
> task has a non-trivial startup overhead.
> - The recommendation is to consolidate your small files into large files.
> One way to do this is via SequenceFiles... put the filename in the
> SequenceFile key field, and the file's bytes in the SequenceFile value
> field.
>
> In addition to the HDFS design docs, I recommend reading this blog post:
> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>
> Happy Hadooping,
>
> - Patrick
>
> On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pi...@gmail.com>
> wrote:
>
> > Okay, thank you :)
> >
> >
> > On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <bbockelm@cse.unl.edu
> > >wrote:
> >
> > >
> > > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> > >
> > > > Hi, thanks for this fast answer :)
> > > > If so, what do you mean by blocks? If a file has to be splitted, it
> > will
> > > be
> > > > splitted when larger than 64MB?
> > > >
> > >
> > > For every 64MB of the file, Hadoop will create a separate block.  So,
> if
> > > you have a 32KB file, there will be one block of 32KB.  If the file is
> > 65MB,
> > > then it will have one block of 64MB and another block of 1MB.
> > >
> > > Splitting files is very useful for load-balancing and distributing I/O
> > > across multiple nodes.  At 32KB / file, you don't really need to split
> > the
> > > files at all.
> > >
> > > I recommend reading the HDFS design document for background issues like
> > > this:
> > >
> > > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> > >
> > > Brian
> > >
> > > >
> > > >
> > > >
> > > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <
> bbockelm@cse.unl.edu
> > > >wrote:
> > > >
> > > >> Hey Pierre,
> > > >>
> > > >> These are not traditional filesystem blocks - if you save a file
> > smaller
> > > >> than 64MB, you don't lose 64MB of file space..
> > > >>
> > > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata
> > or
> > > >> so), not 64MB.
> > > >>
> > > >> Brian
> > > >>
> > > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > > >>
> > > >>> Hi,
> > > >>> I'm porting a legacy application to hadoop and it uses a bunch of
> > small
> > > >>> files.
> > > >>> I'm aware that having such small files ain't a good idea but I'm
> not
> > > >> doing
> > > >>> the technical decisions and the port has to be done for
> yesterday...
> > > >>> Of course such small files are a problem, loading 64MB blocks for a
> > few
> > > >>> lines of text is an evident loss.
> > > >>> What will happen if I set a smaller, or even way smaller (32kB)
> > blocks?
> > > >>>
> > > >>> Thank you.
> > > >>>
> > > >>> Pierre ANCELOT.
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > http://www.neko-consulting.com
> > > > Ego sum quis ego servo
> > > > "Je suis ce que je protège"
> > > > "I am what I protect"
> > >
> > >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>



-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Patrick Angeles <pa...@cloudera.com>.
Pierre,

Adding to what Brian has said (some things are not explicitly mentioned in
the HDFS design doc)...

- If you have small files that take up < 64MB, you do not actually use the
entire 64MB block on disk; a file only consumes as much disk space as the
data it holds.
- You *do* use up RAM on the NameNode, as each block is represented by
meta-data that must be kept in the NameNode's memory.
- Hadoop won't perform optimally with very small block sizes. Hadoop I/O is
optimized for high sustained throughput per single file/block, and there is
a penalty for doing too many seeks to get to the beginning of each block.
Additionally, you will get a MapReduce task per small file, and each
MapReduce task has a non-trivial startup overhead.
- The recommendation is to consolidate your small files into large files.
One way to do this is via SequenceFiles: put the filename in the
SequenceFile key field and the file's bytes in the SequenceFile value
field (see the sketch just below this list).
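
A minimal sketch of that consolidation step (untested, written against the
plain 0.20-era SequenceFile API; the class name, local directory and HDFS
path below are just placeholders):

    import java.io.File;
    import java.io.FileInputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    // Packs a local directory of small files into one SequenceFile on HDFS:
    // key = original filename, value = raw file bytes.
    public class SmallFilePacker {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, new Path("/user/pierre/packed.seq"),
            Text.class, BytesWritable.class);
        try {
          for (File f : new File("/data/small-files").listFiles()) {
            byte[] buf = new byte[(int) f.length()];
            FileInputStream in = new FileInputStream(f);
            try {
              // Slurp the whole small file into memory (fine at ~32KB each).
              IOUtils.readFully(in, buf, 0, buf.length);
            } finally {
              in.close();
            }
            // One record per original file.
            writer.append(new Text(f.getName()), new BytesWritable(buf));
          }
        } finally {
          writer.close();
        }
      }
    }

A MapReduce job can then read the packed file with SequenceFileInputFormat
and process one record per original small file instead of launching one map
task per file.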

In addition to the HDFS design docs, I recommend reading this blog post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/

Happy Hadooping,

- Patrick

On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT <pi...@gmail.com> wrote:

> Okay, thank you :)
>
>
> On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <bbockelm@cse.unl.edu
> >wrote:
>
> >
> > On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
> >
> > > Hi, thanks for this fast answer :)
> > > If so, what do you mean by blocks? If a file has to be split, will it
> > > be split when larger than 64MB?
> > >
> >
> > For every 64MB of the file, Hadoop will create a separate block.  So, if
> > you have a 32KB file, there will be one block of 32KB.  If the file is
> > 65MB, then it will have one block of 64MB and another block of 1MB.
> >
> > Splitting files is very useful for load-balancing and distributing I/O
> > across multiple nodes.  At 32KB / file, you don't really need to split
> > the files at all.
> >
> > I recommend reading the HDFS design document for background issues like
> > this:
> >
> > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
> >
> > Brian
> >
> > >
> > >
> > >
> > > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <bbockelm@cse.unl.edu
> > >wrote:
> > >
> > >> Hey Pierre,
> > >>
> > >> These are not traditional filesystem blocks - if you save a file
> > >> smaller than 64MB, you don't lose 64MB of file space..
> > >>
> > >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata
> > >> or so), not 64MB.
> > >>
> > >> Brian
> > >>
> > >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> > >>
> > >>> Hi,
> > >>> I'm porting a legacy application to hadoop and it uses a bunch of
> small
> > >>> files.
> > >>> I'm aware that having such small files ain't a good idea but I'm not
> > >> doing
> > >>> the technical decisions and the port has to be done for yesterday...
> > >>> Of course such small files are a problem, loading 64MB blocks for a
> few
> > >>> lines of text is an evident loss.
> > >>> What will happen if I set a smaller, or even way smaller (32kB)
> blocks?
> > >>>
> > >>> Thank you.
> > >>>
> > >>> Pierre ANCELOT.
> > >>
> > >>
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> >
> >
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
Okay, thank you :)


On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman <bb...@cse.unl.edu>wrote:

>
> On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
>
> > Hi, thanks for this fast answer :)
> > If so, what do you mean by blocks? If a file has to be split, will it
> > be split when larger than 64MB?
> >
>
> For every 64MB of the file, Hadoop will create a separate block.  So, if
> you have a 32KB file, there will be one block of 32KB.  If the file is 65MB,
> then it will have one block of 64MB and another block of 1MB.
>
> Splitting files is very useful for load-balancing and distributing I/O
> across multiple nodes.  At 32KB / file, you don't really need to split the
> files at all.
>
> I recommend reading the HDFS design document for background issues like
> this:
>
> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
>
> Brian
>
> >
> >
> >
> > On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <bbockelm@cse.unl.edu
> >wrote:
> >
> >> Hey Pierre,
> >>
> >> These are not traditional filesystem blocks - if you save a file smaller
> >> than 64MB, you don't lose 64MB of file space..
> >>
> >> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
> >> so), not 64MB.
> >>
> >> Brian
> >>
> >> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
> >>
> >>> Hi,
> >>> I'm porting a legacy application to hadoop and it uses a bunch of small
> >>> files.
> >>> I'm aware that having such small files ain't a good idea but I'm not
> >> doing
> >>> the technical decisions and the port has to be done for yesterday...
> >>> Of course such small files are a problem, loading 64MB blocks for a few
> >>> lines of text is an evident loss.
> >>> What will happen if I set a smaller, or even way smaller (32kB) blocks?
> >>>
> >>> Thank you.
> >>>
> >>> Pierre ANCELOT.
> >>
> >>
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
>
>


-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:

> Hi, thanks for this fast answer :)
> If so, what do you mean by blocks? If a file has to be split, will it be
> split when larger than 64MB?
> 

For every 64MB of the file, Hadoop will create a separate block.  So, if you have a 32KB file, there will be one block of 32KB.  If the file is 65MB, then it will have one block of 64MB and another block of 1MB.

Splitting files is very useful for load-balancing and distributing I/O across multiple nodes.  At 32KB / file, you don't really need to split the files at all.
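
If you do end up needing a non-default block size for some files, it can be
set per file rather than cluster-wide. A rough sketch (untested; the class
name, path and sizes are placeholders, and "dfs.block.size" is the 0.20-era
property name for the client-side default):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CustomBlockSize {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Option 1: change the default used for files created by this client.
        // conf.setLong("dfs.block.size", 32L * 1024 * 1024);

        FileSystem fs = FileSystem.get(conf);

        // Option 2: override the block size for a single file at create time.
        long blockSize = 32L * 1024 * 1024;   // 32MB instead of the 64MB default
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);
        FSDataOutputStream out = fs.create(
            new Path("/user/pierre/output.dat"),
            true,                             // overwrite if it exists
            bufferSize,
            fs.getDefaultReplication(),
            blockSize);
        try {
          out.writeUTF("data larger than 32MB now gets split into 32MB blocks");
        } finally {
          out.close();
        }
      }
    }

Keep in mind the block size generally has to be a multiple of the checksum
chunk size (io.bytes.per.checksum, 512 bytes by default), and as noted
elsewhere in this thread, going very small mostly just multiplies NameNode
metadata.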

I recommend reading the HDFS design document for background issues like this:

http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html

Brian

> 
> 
> 
> On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <bb...@cse.unl.edu>wrote:
> 
>> Hey Pierre,
>> 
>> These are not traditional filesystem blocks - if you save a file smaller
>> than 64MB, you don't lose 64MB of file space..
>> 
>> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
>> so), not 64MB.
>> 
>> Brian
>> 
>> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>> 
>>> Hi,
>>> I'm porting a legacy application to hadoop and it uses a bunch of small
>>> files.
>>> I'm aware that having such small files ain't a good idea but I'm not
>> doing
>>> the technical decisions and the port has to be done for yesterday...
>>> Of course such small files are a problem, loading 64MB blocks for a few
>>> lines of text is an evident loss.
>>> What will happen if I set a smaller, or even way smaller (32kB) blocks?
>>> 
>>> Thank you.
>>> 
>>> Pierre ANCELOT.
>> 
>> 
> 
> 
> -- 
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"


Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Pierre ANCELOT <pi...@gmail.com>.
Hi, thanks for this fast answer :)
If so, what do you mean by blocks? If a file has to be split, will it be
split when larger than 64MB?




On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman <bb...@cse.unl.edu>wrote:

> Hey Pierre,
>
> These are not traditional filesystem blocks - if you save a file smaller
> than 64MB, you don't lose 64MB of file space..
>
> Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
> so), not 64MB.
>
> Brian
>
> On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
>
> > Hi,
> > I'm porting a legacy application to hadoop and it uses a bunch of small
> > files.
> > I'm aware that having such small files ain't a good idea but I'm not
> doing
> > the technical decisions and the port has to be done for yesterday...
> > Of course such small files are a problem, loading 64MB blocks for a few
> > lines of text is an evident loss.
> > What will happen if I set a smaller, or even way smaller (32kB) blocks?
> >
> > Thank you.
> >
> > Pierre ANCELOT.
>
>


-- 
http://www.neko-consulting.com
Ego sum quis ego servo
"Je suis ce que je protège"
"I am what I protect"

Re: Any possible to set hdfs block size to a value smaller than 64MB?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Pierre,

These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space..

Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB.
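
If you want to convince yourself, a quick check along these lines (untested
sketch; pass any HDFS path as the argument) prints the actual stored length
next to the block-size setting, which is only an upper bound per block:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockUsageCheck {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path(args[0]));

        // getLen() is the real file length in HDFS; getBlockSize() is only
        // the maximum size each block of this file is allowed to grow to.
        System.out.println("length     : " + st.getLen() + " bytes");
        System.out.println("block size : " + st.getBlockSize() + " bytes");
      }
    }

Running "hadoop fsck <path> -files -blocks" shows the same thing from the
NameNode's side: one block whose reported length matches the file, not a
padded 64MB block.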

Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:

> Hi,
> I'm porting a legacy application to hadoop and it uses a bunch of small
> files.
> I'm aware that having such small files ain't a good idea but I'm not doing
> the technical decisions and the port has to be done for yesterday...
> Of course such small files are a problem, loading 64MB blocks for a few
> lines of text is an evident loss.
> What will happen if I set a smaller, or even way smaller (32kB) blocks?
> 
> Thank you.
> 
> Pierre ANCELOT.