Posted to common-user@hadoop.apache.org by Raja Nagendra Kumar <Na...@tejasoft.com> on 2011/07/17 04:07:00 UTC

Re: Poor IO performance on a 10 node cluster.

Hi,

Is this the speed you observe during the initial writes of the files
(i.e., while you are first putting the 10 GB files into HDFS with replication)?

Regards,
Raja Nagendra Kumar


Gyuribácsi wrote:
> 
>  
> Hi,
> 
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
> cores).
> 
> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming
> jar with 'wc -l' as mapper and 'cat' as reducer.
> 
> I use 64MB block size and the default replication (3).
> 
> The wc on the 100 GB took about 220 seconds, which translates to about 3.5
> Gbit/sec processing speed. One disk can do sequential reads at 1 Gbit/sec,
> so I would expect something around 20 Gbit/sec (minus some overhead), and
> I'm getting only 3.5.
> 
> Is my expectation valid?
> 
> I checked the JobTracker and it seems all nodes are working, each reading
> the right blocks. I have not played with the number of mappers and reducers
> yet. It seems the number of mappers is the same as the number of blocks,
> and the number of reducers is 20 (there are 20 disks). This looks OK to me.
> 
> We also did an experiment with TestDFSIO with similar results. Aggregated
> read I/O speed is around 3.5 Gbit/sec. It is just too far from my
> expectation :(
> 
> Please help!
> 
> Thank you,
> Gyorgy
> 
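
For reference, the arithmetic in the quoted message can be checked with a
short sketch like the one below. It is only a back-of-the-envelope
calculation, not a benchmark. The assumptions are mine, not the original
poster's: 1 GB is taken as 10**9 bytes, 64MB as 64 * 2**20 bytes, every one
of the 20 disks is assumed to stream at the quoted 1 Gbit/sec in parallel,
and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not from the original post): decimal gigabytes,
# binary megabytes for the block size, perfectly parallel disks.

TOTAL_BYTES = 10 * 10 * 10**9      # 10 files of 10 GB each
ELAPSED_S = 220                    # reported job duration in seconds
NODES = 10
DISKS_PER_NODE = 2
DISK_GBIT_S = 1.0                  # quoted sequential read rate per disk
BLOCK_BYTES = 64 * 2**20           # 64MB HDFS block size


def observed_gbit_per_s(total_bytes, seconds):
    """Aggregate throughput in Gbit/sec for total_bytes read in seconds."""
    return total_bytes * 8 / seconds / 1e9


def expected_gbit_per_s(nodes, disks_per_node, disk_gbit_s):
    """Upper bound if every disk streams at full speed simultaneously."""
    return nodes * disks_per_node * disk_gbit_s


def block_count(file_bytes, block_bytes):
    """Number of blocks for one file (ceiling division)."""
    return -(-file_bytes // block_bytes)


observed = observed_gbit_per_s(TOTAL_BYTES, ELAPSED_S)
expected = expected_gbit_per_s(NODES, DISKS_PER_NODE, DISK_GBIT_S)
blocks_per_file = block_count(TOTAL_BYTES // 10, BLOCK_BYTES)

print(f"observed: {observed:.2f} Gbit/sec")
print(f"expected upper bound: {expected:.1f} Gbit/sec")
print(f"fraction of upper bound: {observed / expected:.0%}")
print(f"blocks per 10 GB file: {blocks_per_file}")
```

Under these assumptions the observed rate comes out slightly above the
quoted 3.5 Gbit/sec, which is consistent with "about 3.5". The 20 Gbit/sec
figure is a ceiling that ignores task startup, the reduce phase, and any
non-local reads, so the real gap is somewhat smaller than it looks.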

-- 
View this message in context: http://old.nabble.com/Poor-IO-performance-on-a-10-node-cluster.-tp31732971p32076106.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.