Posted to common-user@hadoop.apache.org by Stas Oskin <st...@gmail.com> on 2010/01/02 23:54:08 UTC

Re: HDFS read/write speeds, and read optimization

Hi.

Can anyone advise on the subject below?

Thanks!

On Mon, Dec 28, 2009 at 9:01 PM, Stas Oskin <st...@gmail.com> wrote:

> Hi.
>
> Going back to the subject, has anyone ever benchmarked small (10 - 20
> node) HDFS clusters?
>
> I did my own speed checks, and it seems I can reach ~77Mbps, on a quad-disk
> node. This comes to ~19Mbps per disk, which seems quite low in my opinion.
>
> Can anyone advise about this?
>
> Thanks.
>

Re: HDFS read/write speeds, and read optimization

Posted by Stas Oskin <st...@gmail.com>.
Hi again.

By the way, I forgot to mention that I run the tests on the same machines
that serve as DataNodes, i.e. the same machine acts as both client and
DataNode.

Regards.

Re: HDFS read/write speeds, and read optimization

Posted by Eli Collins <el...@cloudera.com>.
> I actually tested it with a simple Java test loader I quickly put together,
> which ran on each machine and continuously wrote random data to DFS. I
> tuned the write rate until I got ~77Mb/s; above that, the iowait load on
> each disk (measured by iostat) rose above 50% - 60%, which is quite close
> to the disks' limits.

How many DNs are you using? How many copies of the benchmark are you
running? What results do you get just running a single copy of the
benchmark?

I see ~46 MB/s when hadoop fs put'ing a local 1 GB file from one DN, using
3-way replication. Running the test on three DNs I get around 30 MB/s.
This is a little less than half the theoretical limit (using three
hosts, each with a single gigabit NIC). In these tests I purged the
buffer cache before running the test. With the input file cached in
memory (more similar to your test) I get 92 MB/s on one host but about
the same rate for three hosts (we're network bound). This is about 3x
faster than what you're seeing, so I suspect something's up with your
test. It would be useful for you to see what results you get running the
same test I did.
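
As a rough illustration of where a "theoretical limit" for the three-host
case could come from, here is a back-of-envelope sketch. It is not a
statement of how the figure above was derived. It assumes each host is both
writer and DataNode, the first replica lands on the local disk, the other
replicas each cross the network once and are spread evenly over the hosts,
and a gigabit NIC moves 125 MB/s in each direction with no protocol
overhead. The function name is made up for this example.

```python
# Back-of-envelope ceiling for per-host HDFS write throughput when every
# DataNode is also a writer. All figures are assumptions, not measurements.

GIGABIT_NIC_MB_S = 1000 / 8  # 1 Gbit/s = 125 MB/s, ignoring protocol overhead


def per_host_write_ceiling(replication, nic_mb_s=GIGABIT_NIC_MB_S):
    """Upper bound on each host's write rate when the network is the bottleneck.

    Assumes the first replica is written locally and the remaining
    (replication - 1) copies each cross the network once, spread evenly over
    the hosts, so each NIC carries (replication - 1) times the write rate in
    each direction.
    """
    remote_copies = replication - 1
    if remote_copies <= 0:
        return float("inf")  # purely local writes: no network limit
    return nic_mb_s / remote_copies


ceiling = per_host_write_ceiling(replication=3)
print(f"3-way replication ceiling: {ceiling:.1f} MB/s per host")
print(f"30 MB/s observed is {30 / ceiling:.0%} of that ceiling")
```

Under these assumptions the ceiling is 62.5 MB/s per host, so ~30 MB/s is
indeed a little less than half of it.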

> You mentioned some TestDFSIO, any idea if it's present in 0.18.3?

It's in 0.18.3. See src/test/org/apache/hadoop/fs/TestDFSIO.java

Thanks,
Eli

Re: HDFS read/write speeds, and read optimization

Posted by Stas Oskin <st...@gmail.com>.
Hi.

> Can you provide more information about your workload and the
> environment? eg are you running t.o.a.h.h.BenchmarkThroughput,
> TestDFSIO, or timing hadoop fs -put/get to transfer data to hdfs from
> another machine, looking at metrics, etc. What else is running on the
> cluster? Have you profiled? etc. 77Mb/s (<10MB/s) seems low but w/o
> context is not meaningful.
>


I actually tested it with a simple Java test loader I quickly put together,
which ran on each machine and continuously wrote random data to DFS. I
tuned the write rate until I got ~77Mb/s; above that, the iowait load on
each disk (measured by iostat) rose above 50% - 60%, which is quite close
to the disks' limits.

If there is a more official / better way to do it, I'll be happy to get some
pointers to it.
You mentioned some TestDFSIO, any idea if it's present in 0.18.3?

Regards.

Re: HDFS read/write speeds, and read optimization

Posted by Eli Collins <el...@cloudera.com>.
Hey Stas,

Can you provide more information about your workload and the
environment? E.g. are you running t.o.a.h.h.BenchmarkThroughput or
TestDFSIO, timing hadoop fs -put/get to transfer data to HDFS from
another machine, looking at metrics, etc.? What else is running on the
cluster? Have you profiled? 77Mb/s (<10MB/s) seems low, but without
context it is not meaningful.
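
For the "timing hadoop fs -put" option, a minimal sketch of a timing
wrapper follows. The function name and the paths in the usage note are
hypothetical. It only measures wall-clock time of the command, so client
startup time is included in the result.

```python
import subprocess
import time


def timed_throughput_mb_s(command, payload_bytes):
    """Run `command` to completion and return payload MB/s (1 MB = 10**6 bytes).

    `command` is an argument list, e.g. ["hadoop", "fs", "-put", src, dst].
    `payload_bytes` is the size of the data the command transfers; the caller
    supplies it, since this helper does not inspect the command.
    """
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the command fails
    elapsed = time.perf_counter() - start
    return payload_bytes / 1e6 / elapsed
```

Example use, with placeholder paths:
timed_throughput_mb_s(["hadoop", "fs", "-put", "/tmp/test-1g.bin",
"/bench/test-1g.bin"], 10**9). Purging the buffer cache beforehand, or
not, changes what is being measured, so note which one was done.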

Thanks,
Eli

On Sat, Jan 2, 2010 at 2:54 PM, Stas Oskin <st...@gmail.com> wrote:
> Hi.
>
> Can anyone advise on the subject below?
>
> Thanks!
>
> On Mon, Dec 28, 2009 at 9:01 PM, Stas Oskin <st...@gmail.com> wrote:
>
>> Hi.
>>
>> Going back to the subject, has anyone ever benchmarked small (10 - 20
>> node) HDFS clusters?
>>
>> I did my own speed checks, and it seems I can reach ~77Mbps, on a quad-disk
>> node. This comes to ~19Mbps per disk, which seems quite low in my opinion.
>>
>> Can anyone advise about this?
>>
>> Thanks.
>>
>

Re: HDFS read/write speeds, and read optimization

Posted by Stas Oskin <st...@gmail.com>.
Hi.

> We run with 2-way replication.  The wonderful folks at Yahoo! worked through
> most of the bugs during 0.19.x IIRC.  There were never any bugs with 2-way
> replication per se, but running a cluster with 2 replicas exposed other bugs
> at a 100x rate compared to running with 3 replicas (due to the fact that a
> silent corruption + loss of a single data node = file loss).
>
> I'd estimate we lose files at a rate of about 1 per month for 200TB of
> actual data.  That number would probably go down an order of magnitude or
> more if we were running with 3 replicas.
>
> Hope this helps.
>
>
Thanks for sharing!

So, is there good reason to believe that versions 0.19 and higher have the
file storage / silent corruption issues sorted out?

Regards.

Re: HDFS read/write speeds, and read optimization

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hi,

We run with 2-way replication.  The wonderful folks at Yahoo! worked through most of the bugs during 0.19.x IIRC.  There were never any bugs with 2-way replication per se, but running a cluster with 2 replicas exposed other bugs at a 100x rate compared to running with 3 replicas (due to the fact that a silent corruption + loss of a single data node = file loss).

I'd estimate we lose files at a rate of about 1 per month for 200TB of actual data.  That number would probably go down an order of magnitude or more if we were running with 3 replicas.

Hope this helps.

Brian

On Jan 10, 2010, at 3:55 AM, Eli Collins wrote:

>> data.replication = 2
>> 
>> A bit off topic: is it safe to use such a value? About a year ago I heard
>> that only 3-way replication was fully tested, while 2-way had some issues.
>> Was that fixed in subsequent versions?
> 
> I think that's still a relatively untested configuration, though I'm
> not aware of any known bugs with it. I know of at least one cluster
> that uses 2-way replication.  Note that 3-way replication is used both
> for availability and performance, though in a write benchmark 2-way
> replication should be faster than 3-way.
> 
> Thanks,
> Eli


Re: HDFS read/write speeds, and read optimization

Posted by Eli Collins <el...@cloudera.com>.
> data.replication = 2
>
> A bit off topic: is it safe to use such a value? About a year ago I heard
> that only 3-way replication was fully tested, while 2-way had some issues.
> Was that fixed in subsequent versions?

I think that's still a relatively untested configuration, though I'm
not aware of any known bugs with it. I know of at least one cluster
that uses 2-way replication.  Note that 3-way replication is used both
for availability and performance, though in a write benchmark 2-way
replication should be faster than 3-way.

Thanks,
Eli

Re: HDFS read/write speeds, and read optimization

Posted by Stas Oskin <st...@gmail.com>.
Hi.

> Also, it would be interesting to know what "data.replication" setting you
> have for this benchmark.
>
>
data.replication = 2

A bit off topic: is it safe to use such a value? About a year ago I heard
that only 3-way replication was fully tested, while 2-way had some issues.
Was that fixed in subsequent versions?

Regards.

Re: HDFS read/write speeds, and read optimization

Posted by Rajesh Balamohan <ra...@gmail.com>.
Also, it would be interesting to know what "data.replication" setting you
have for this benchmark.

On Sun, Jan 3, 2010 at 8:51 AM, Andreas Kostyrka <an...@kostyrka.org> wrote:

> Well, that all depends on many details, but:
>
> -) are you really using 4 discs (configured correctly as data
> directories?)
>
> -) What hdd/connection technology?
>
> -) And 77MB/s would match up curiously well with 1Gbit networking cards?
> So are you sure that you are testing a completely local setup? Where's
> your name node running then?
>
> Andreas
>
>
> On Sunday, 03.01.2010, at 00:54 +0200, Stas Oskin wrote:
> > Hi.
> >
> > Can anyone advise on the subject below?
> >
> > Thanks!
> >
> > On Mon, Dec 28, 2009 at 9:01 PM, Stas Oskin <st...@gmail.com> wrote:
> >
> > > Hi.
> > >
> > > Going back to the subject, has anyone ever benchmarked small (10 - 20
> > > node) HDFS clusters?
> > >
> > > I did my own speed checks, and it seems I can reach ~77Mbps, on a
> > > quad-disk node. This comes to ~19Mbps per disk, which seems quite low
> > > in my opinion.
> > >
> > > Can anyone advise about this?
> > >
> > > Thanks.
> > >
>
>


-- 
~Rajesh.B

Re: HDFS read/write speeds, and read optimization

Posted by Stas Oskin <st...@gmail.com>.
Hi.

> Well, that all depends on many details, but:
>
> -) are you really using 4 discs (configured correctly as data
> directories?)
>
>
Yes, 4 directories, one per disk.


> -) What hdd/connection technology?
>
>
SATA 3Gb/s


> -) And 77MB/s would match up curiously well with 1Gbit networking cards?
> So are you sure that you are testing a completely local setup? Where's
> your name node running then?
>
>
I actually mixed this up: it is 77Mb/s (bits, not bytes). Sorry for the
confusion.
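
To make the bits-versus-bytes distinction concrete, here is a small worked
conversion of the figures from this thread (77 megabits per second on a
quad-disk node). It treats the 77 as the client write rate and does not
account for replication traffic, which would add to what each disk actually
writes. The helper name is made up for this example.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


total_mbit = 77  # reported node write rate, in megabits per second
disks = 4        # data disks per node

total_mbyte = mbit_to_mbyte(total_mbit)
per_disk_mbyte = total_mbyte / disks

print(f"{total_mbit} Mb/s = {total_mbyte:.1f} MB/s per node")
print(f"That is {per_disk_mbyte:.1f} MB/s per disk across {disks} disks")
```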

Regards.

Re: HDFS read/write speeds, and read optimization

Posted by Andreas Kostyrka <an...@kostyrka.org>.
Well, that all depends on many details, but:

-) are you really using 4 discs (configured correctly as data
directories?)

-) What hdd/connection technology?

-) And 77MB/s would match up curiously well with 1Gbit networking cards?
So are you sure that you are testing a completely local setup? Where's
your name node running then?

Andreas 


On Sunday, 03.01.2010, at 00:54 +0200, Stas Oskin wrote:
> Hi.
> 
> Can anyone advise on the subject below?
> 
> Thanks!
> 
> On Mon, Dec 28, 2009 at 9:01 PM, Stas Oskin <st...@gmail.com> wrote:
> 
> > Hi.
> >
> > Going back to the subject, has anyone ever benchmarked small (10 - 20
> > node) HDFS clusters?
> >
> > I did my own speed checks, and it seems I can reach ~77Mbps, on a quad-disk
> > node. This comes to ~19Mbps per disk, which seems quite low in my opinion.
> >
> > Can anyone advise about this?
> >
> > Thanks.
> >