Posted to common-user@hadoop.apache.org by stephen mulcahy <st...@deri.org> on 2010/05/11 16:33:30 UTC

Re: Hadoop performance - xfs and ext4

On 23/04/10 15:43, Todd Lipcon wrote:
> Hi Stephen,
>
> Can you try mounting ext4 with the nodelalloc option? I've seen the same
> improvement due to delayed allocation but been a little nervous about that
> option (especially in the NN where we currently follow what the kernel
> people call an antipattern for image rotation).

Hi Todd,

Sorry for the delayed response - I had to wait for another test window 
before trying this out.

To clarify, my namenode and secondary namenode have been using ext4 in 
all tests - reconfiguring the datanodes is a fast operation, the nn and 
2nn less so. I figure any big performance benefit would appear on the 
data nodes anyway and can then apply it back to the nn and 2nn if 
testing shows any benefits in changing.

So I tried running our datanodes with their ext4 filesystems mounted 
using "noatime,nodelalloc" and after 6 runs of the TeraSort, it seems it 
runs SLOWER with those options by between 5% and 8%. The TeraGen itself 
seemed to run about 5% faster but it was only a single run so I'm not 
sure how reliable that is.
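
Before comparing runs, it may be worth confirming that the options are
actually in effect on each datanode. The following is a minimal sketch, not
part of the original test setup: it parses text in the /proc/mounts format
and reports which wanted options are missing. The device name, the mount
point /data/1 and the function names are hypothetical, and it assumes the
kernel lists nodelalloc in /proc/mounts when the option is active.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of options for mount_point from /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, comma-separated options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            return set(fields[3].split(","))
    return set()


def missing_options(mounts_text, mount_point, wanted):
    """Return the wanted options that are not active on mount_point."""
    return sorted(set(wanted) - mount_options(mounts_text, mount_point))


# Hypothetical sample line; on a real datanode, read the text from /proc/mounts.
SAMPLE = "/dev/sdb1 /data/1 ext4 rw,noatime,nodelalloc 0 0\n"

if __name__ == "__main__":
    print(missing_options(SAMPLE, "/data/1", ["noatime", "nodelalloc"]))
```

An empty list means both options are active on that mount; anything else
names the options that did not take effect.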

hth,

-stephen

-- 
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com

Re: Hadoop performance - xfs and ext4

Posted by Edward Capriolo <ed...@gmail.com>.
On Tue, May 11, 2010 at 10:39 AM, Todd Lipcon <to...@cloudera.com> wrote:

> Yep, that's what I'd expect. noatime should be a small improvement,
> nodelalloc should be a small detriment. The thing is that delayed allocation
> has some strange cases that could theoretically cause data loss after a
> power outage, so I was interested to see if it nullified all of your
> performance gains or if it were just a small hit.
>
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


For most people, tuning the disk configuration for the NameNode is
wasted time. Why? The current capacity of our Hadoop cluster is

Present Capacity: 111148799678056 (101.09 TB)

Yet the NameNode data itself is tiny.

du -hs /usr/local/hadoop_root/hdfs_master
684M    /usr/local/hadoop_root/hdfs_master
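
To put those two figures side by side, here is a small arithmetic sketch.
It assumes the 684M reported by du is in MiB and that the "TB" in the
capacity line is really TiB (2**40 bytes); the variable names are only
illustrative.

```python
capacity_bytes = 111148799678056    # "Present Capacity" figure quoted above
namenode_bytes = 684 * 1024 ** 2    # 684M from du -hs, taken as MiB (assumption)

# Convert capacity to TiB and express the NameNode data as a share of it.
capacity_tib = capacity_bytes / 1024 ** 4
fraction = namenode_bytes / capacity_bytes

if __name__ == "__main__":
    print(f"capacity: {capacity_tib:.2f} TB")
    print(f"NameNode data is {fraction:.6%} of capacity")
```

Under those assumptions the NameNode data is well under a thousandth of a
percent of the cluster capacity, roughly a 155,000-to-1 ratio.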

Likely the entire Node table fits inside the VFS cache, so performance
is not usually an issue; reliability is. The more exotic you get with this
mount (EXT5, rarely used mount options), the less reliable it is going to be
(IMHO). This is because your configuration space is not shared by that many
people.

DataNodes are a different story. These are worth tuning. I suggest
configuring a single datanode differently (say, EXT4 with fancy options
x,y,z), waiting a while for it to take real production load, then looking
at some performance data to see if this node shows any tangible difference
in performance. Do not look for low-level things like bonnie saying the
delete rate is +5% but the create rate is -5%. Look at the big picture: if
you can't see a tangible big-picture difference like 'map jobs seem to
finish 5% faster on this node', what are you doing the tuning for :) ?
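
One way to make that big-picture check concrete is to compare median map
task durations on the tuned node against the rest. This is only a sketch
under stated assumptions: the durations below are invented placeholders,
not measurements, and the function names and the 5% threshold are
hypothetical choices for illustration.

```python
from statistics import median


def relative_difference(tuned_durations, baseline_durations):
    """Fractional change in median task duration; negative means tuned is faster."""
    tuned = median(tuned_durations)
    baseline = median(baseline_durations)
    return (tuned - baseline) / baseline


def is_tangible(tuned_durations, baseline_durations, threshold=0.05):
    """True when the tuned node's median is at least `threshold` faster."""
    return relative_difference(tuned_durations, baseline_durations) <= -threshold


# Placeholder map-task durations in seconds, invented for illustration only.
tuned_node = [94.0, 95.0, 93.0, 96.0, 94.0]
other_nodes = [100.0, 101.0, 99.0, 102.0, 100.0]

if __name__ == "__main__":
    change = relative_difference(tuned_node, other_nodes)
    print(f"median change: {change:+.1%}, "
          f"tangible: {is_tangible(tuned_node, other_nodes)}")
```

The median is used rather than the mean so that a few straggler tasks do
not dominate the comparison; real durations would come from whatever job
history data you already collect.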

I know this seems like a rather unscientific approach, but disk
tuning/performance measuring is very complex because the application, VFS
cache, and available memory are the critical factors in performance.

Re: Hadoop performance - xfs and ext4

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, May 11, 2010 at 7:33 AM, stephen mulcahy
<st...@deri.org> wrote:

> On 23/04/10 15:43, Todd Lipcon wrote:
>
>> Hi Stephen,
>>
>> Can you try mounting ext4 with the nodelalloc option? I've seen the same
>> improvement due to delayed allocation but been a little nervous about that
>> option (especially in the NN where we currently follow what the kernel
>> people call an antipattern for image rotation).
>>
>
> Hi Todd,
>
> Sorry for the delayed response - I had to wait for another test window
> before trying this out.
>
> To clarify, my namenode and secondary namenode have been using ext4 in all
> tests - reconfiguring the datanodes is a fast operation, the nn and 2nn less
> so. I figure any big performance benefit would appear on the data nodes
> anyway and can then apply it back to the nn and 2nn if testing shows any
> benefits in changing.
>
> So I tried running our datanodes with their ext4 filesystems mounted using
> "noatime,nodelalloc" and after 6 runs of the TeraSort, it seems it runs
> SLOWER with those options by between 5% and 8%. The TeraGen itself seemed
> to run about 5% faster but it was only a single run so I'm not sure how
> reliable that is.
>

Yep, that's what I'd expect. noatime should be a small improvement,
nodelalloc should be a small detriment. The thing is that delayed allocation
has some strange cases that could theoretically cause data loss after a
power outage, so I was interested to see if it nullified all of your
performance gains or if it were just a small hit.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera