Posted to user@hbase.apache.org by Dieter Reuter <re...@googlemail.com> on 2011/01/05 21:26:34 UTC

Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

I'm just trying to evaluate HBase/Hadoop on a small cluster.

For the very first tests I just set up a cluster of 6 nodes on a single ESXi
server, 1x Master/ZK/NN and 5x RS/DN. The setup process was quite easy and
straightforward with CDH3b3 on CentOS 5.5. Now I'm able to play around with
the basic operations. But for real performance tests I'll have to add fast
disks and go for real hardware.

From the list I learned that to increase disk throughput, you just use more
disks/spindles. So I'd like to go for real nodes with 4x 1TB
disks per node. I think 2TB disks are slower, and I really don't need 8TB
per node for my POC. For now I would just select standard 7200 rpm SATA disks
like the WD1003FBYX, WD1002FAEX or similar.
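
A rough sizing sketch for the layout described above (5 data nodes with 4x 1TB disks each). The function name, the replication factor of 3 (the HDFS default) and the 100 MB/s per-disk streaming rate are assumptions for illustration only; the per-disk rate is a placeholder, not a measurement of any drive named in this thread.

```python
def cluster_disk_estimate(data_nodes, disks_per_node, disk_tb,
                          replication=3, disk_mb_per_s=100.0):
    """Estimate spindle count, capacity and best-case streaming throughput.

    replication=3 matches the HDFS default; disk_mb_per_s is a placeholder
    value (an assumption), so replace it with a measured number.
    """
    spindles = data_nodes * disks_per_node
    raw_tb = spindles * disk_tb
    # Every block is stored `replication` times, so usable space shrinks.
    usable_tb = raw_tb / replication
    # Upper bound only: ignores network, CPU, seeks and filesystem overhead.
    aggregate_mb_per_s = spindles * disk_mb_per_s
    return {
        "spindles": spindles,
        "raw_tb": raw_tb,
        "usable_tb": usable_tb,
        "aggregate_mb_per_s": aggregate_mb_per_s,
    }


if __name__ == "__main__":
    print(cluster_disk_estimate(data_nodes=5, disks_per_node=4, disk_tb=1.0))
```

The aggregate figure is only a ceiling: it shows why more spindles help streaming workloads, not what HBase will actually deliver.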

But which 1TB disk gives me good performance for a reasonable
price? I can't find any recent comparison for the HBase/Hadoop usage
pattern.

Any recommendations are welcome.
Thanks.

-- 
Dieter

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

Posted by Dieter Reuter <re...@googlemail.com>.
John,
thanks for your answer and the link to the hard drive benchmark.
I've already found a comparison between the WD1002FAEX and the Samsung F3
HD103SJ. Performance is approximately the same, but the F3 is cheaper. Neither
drive is recommended for 24x7 use, but for my POC I think that's quite
OK.

Dieter

On Fri, Jan 7, 2011 at 6:58 PM, John Overman <jb...@gmail.com> wrote:

> I would recommend Samsung Spinpoint F3 drives, which have slightly higher
> ratings than the WD1002FAEX on Newegg, and are the fastest 1TB drive for
> streaming reads.  They're listed as $52.99+shipping ($49.65 for 10 drives)
> at CompUPlus.com (
> http://www.compuplus.com/Drives-and-storage/Samsung-1TB-Spinpoint-F3-7200RPM-1138634.html),
> and if you contact a salesperson, there's a discount for ordering more than
> 10.
>
> Here are some hard drive benchmarks:
>
> http://www.tomshardware.com/charts/2009-3.5-desktop-hard-drive-charts/benchmarks,50.html
>

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

Posted by John Overman <jb...@gmail.com>.
Sorry, I don't have HBase-specific benchmarks.  I think it still operates
with mostly streaming reads and writes, but that would probably depend on your
specific application.
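
Lacking HBase-specific numbers, one way to compare candidate disks yourself is a plain sequential-read timing. The sketch below is only an assumption about how such a check could look: the function names are hypothetical, and the result is inflated by the OS page cache unless the file is much larger than RAM or the cache is dropped first.

```python
import os
import tempfile
import time


def make_test_file(directory, size_bytes):
    """Create a file of size_bytes in `directory` and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    block = os.urandom(1024 * 1024)
    with os.fdopen(fd, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(len(block), remaining)
            f.write(block[:n])
            remaining -= n
    return path


def sequential_read_mb_per_s(path, chunk_bytes=1024 * 1024):
    """Read the file front to back; return (bytes_read, MB per second)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, (total / 1e6) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point `directory` at a mount on the disk under test.
    path = make_test_file(tempfile.gettempdir(), 64 * 1024 * 1024)
    try:
        n, rate = sequential_read_mb_per_s(path)
        print(f"read {n} bytes at {rate:.1f} MB/s")
    finally:
        os.remove(path)
```

This measures streaming reads only; a random-read-heavy HBase application would need a different test.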

On Fri, Jan 7, 2011 at 11:58 AM, John Overman <jb...@gmail.com> wrote:

> I would recommend Samsung Spinpoint F3 drives, which have slightly higher
> ratings than the WD1002FAEX on Newegg, and are the fastest 1TB drive for
> streaming reads.  They're listed as $52.99+shipping ($49.65 for 10 drives)
> at CompUPlus.com (
> http://www.compuplus.com/Drives-and-storage/Samsung-1TB-Spinpoint-F3-7200RPM-1138634.html),
> and if you contact a salesperson, there's a discount for ordering more than
> 10.
>
> Here are some hard drive benchmarks:
>
> http://www.tomshardware.com/charts/2009-3.5-desktop-hard-drive-charts/benchmarks,50.html
>

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

Posted by John Overman <jb...@gmail.com>.
I would recommend Samsung Spinpoint F3 drives, which have slightly higher
ratings than the WD1002FAEX on Newegg, and are the fastest 1TB drive for
streaming reads.  They're listed as $52.99+shipping ($49.65 for 10 drives)
at CompUPlus.com (
http://www.compuplus.com/Drives-and-storage/Samsung-1TB-Spinpoint-F3-7200RPM-1138634.html),
and if you contact a salesperson, there's a discount for ordering more than
10.

Here are some hard drive benchmarks:
http://www.tomshardware.com/charts/2009-3.5-desktop-hard-drive-charts/benchmarks,50.html
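
For a quick cost estimate at those list prices, a small sketch follows. It assumes that "$49.65 for 10 drives" means a per-drive price once at least 10 drives are ordered, that 20 drives are needed (5 nodes with 4 disks each), and that shipping and any further sales discount are left out; the function name is hypothetical.

```python
def drive_order_cost_cents(drive_count, unit_cents, bulk_unit_cents,
                           bulk_min=10):
    """Total order cost in cents, using the bulk price from bulk_min drives on.

    Prices are kept as integer cents to avoid floating-point rounding.
    """
    unit = bulk_unit_cents if drive_count >= bulk_min else unit_cents
    return drive_count * unit


if __name__ == "__main__":
    # Assumption: $52.99 single price, $49.65 per drive at 10 or more.
    total = drive_order_cost_cents(20, unit_cents=5299, bulk_unit_cents=4965)
    print(f"20 drives: ${total / 100:.2f} before shipping")
```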

On Wed, Jan 5, 2011 at 2:55 PM, Krishna Sankar <ks...@gmail.com> wrote:

> My favorites are the Seagate 3.5" drives ST31000340AS & ST31000340NS.
> Cheers
> <k/>

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

Posted by Krishna Sankar <ks...@gmail.com>.
My favorites are the Seagate 3.5" drives ST31000340AS & ST31000340NS.
Cheers
<k/> 
