Posted to common-user@hadoop.apache.org by sagar naik <sn...@attributor.com> on 2010/03/31 00:41:35 UTC

Hadoop DFS IO Performance measurement

Hi All,

I am trying to measure DFS IO performance.
I used TestDFSIO from the Hadoop jars.
The results were about 100Mbps for both read and write.
I think it should be more than this.

Please share some stats to compare.

Either I am missing something like config params, or something else is wrong.


-Sagar

Re: Hadoop DFS IO Performance measurement

Posted by Jason Venner <ja...@gmail.com>.
I completely forgot that the RAID controller can be a bottleneck, as
can the disk connection strategy.
I don't remember what the PERCs top out at, and I don't recall the
aggregate actual bandwidth available for the disks.
I have a simple SSD that I get a steady 100MB/sec out of with SATA 1; I
would guess that SATA 2 tops out at about 150MB/sec in the real world.

Exercise each of your components in isolation,

e.g.: dd if=/dev/MY_RAID0 of=/dev/null bs=64k count=100000
to get an idea of what the disk subsystem can deliver.

The DFS client passes everything through the socket layer, which adds
additional copying and latency.
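
If dd is not convenient, the same kind of isolated read test can be sketched in Python. This is only a rough illustration, not a calibrated benchmark: the function name and the default block size and block count (chosen to mirror bs=64k count=100000 above) are assumptions, and the OS page cache will inflate the number on repeated runs over the same data.

```python
import time


def sequential_read_throughput(path, block_size=64 * 1024, max_blocks=100_000):
    """Read up to max_blocks blocks sequentially; return (bytes_read, MB/sec).

    Hypothetical helper for illustration only. MB here means 10**6 bytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(max_blocks):
            chunk = f.read(block_size)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_sec = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, mb_per_sec
```

Point it at a large file on the RAID 0 volume (or at the block device itself, with read permission) and compare the MB/sec figure with what TestDFSIO reports; a large gap suggests the limit is somewhere other than the disks.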

-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: Hadoop DFS IO Performance measurement

Posted by Jason Venner <ja...@gmail.com>.
Unless you are getting all local IO, and/or you have better than GigE
NIC interfaces,
100MB/sec is your cap.

For local IO the bound is going to be your storage subsystem.
Decent drives in a RAID 0 interface are going to cap out on those
machines at about 400MB/sec, which is the buffer cache bandwidth on those
processors/memory.
Realistically you are going to see quite a bit less, but 200MB/sec should be doable.
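
For reference, the arithmetic behind the GigE cap can be sketched as below. The function names are made up for illustration, and the overhead model is an assumption: a standard 1500-byte MTU, IPv4 and TCP headers without options, and no retransmissions. It also shows why the unit matters: 100 Mbps (megabits) and 100 MB/sec (megabytes) differ by a factor of eight.

```python
def mbit_to_mbyte_per_sec(mbit_per_sec):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_sec / 8.0


def tcp_payload_cap_mbyte_per_sec(link_mbit_per_sec, mtu=1500):
    """Rough upper bound on TCP payload throughput over Ethernet.

    Assumes every frame carries a full MTU and that nothing is lost.
    """
    # Bytes on the wire per frame: MTU + Ethernet header (14) + FCS (4)
    # + preamble (8) + inter-frame gap (12).
    wire_bytes = mtu + 38
    # Payload per frame: MTU minus IPv4 header (20) and TCP header (20).
    payload_bytes = mtu - 40
    return mbit_to_mbyte_per_sec(link_mbit_per_sec) * payload_bytes / wire_bytes


print(mbit_to_mbyte_per_sec(100))           # 12.5 MB/sec for a 100 Mbps figure
print(mbit_to_mbyte_per_sec(1000))          # 125.0 MB/sec raw for GigE
print(tcp_payload_cap_mbyte_per_sec(1000))  # roughly 118.7 MB/sec of payload
```

So a measured rate near 100 MB/sec is already close to what a single GigE link can carry, while a rate of 100 megabits per second would be far below it.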



Re: Hadoop DFS IO Performance measurement

Posted by Sagar Naik <sn...@attributor.com>.
Hi Edson,

Usual commodity machine:
8GB RAM, 2.5GHz Intel Xeon

6 disks: 2 drives in RAID 1, 4 in RAID 0, using a PERC 6

CentOS, ext4

Datanodes configured to use the 4 RAID 0 drives
OS and Hadoop installation on RAID 1

-Sagar


Re: Hadoop DFS IO Performance measurement

Posted by Edson Ramiro <er...@gmail.com>.
Hi Sagar,

What hardware did you run it on?

Edson Ramiro

