Posted to common-dev@hadoop.apache.org by "Khai Cher LIM (NYP)" <LI...@nyp.edu.sg> on 2013/11/08 06:10:06 UTC

Unable to perform terasort for 50GB of data

Dear all,

I have just started learning Hadoop setup, and I am having a problem running terasort on my Hadoop cluster. My input folder contains 50 GB of data, but when I run terasort the tasks fail with the error message shown in the following screenshot.

[screenshot of the failed tasks' error message: "No space left on device"]

I've set my dfs block size to 128 MB. Even with the default 64 MB, the tasks failed for the same reason.
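For reference, I set it in hdfs-site.xml along these lines (value in bytes):

    <property>
      <name>dfs.block.size</name>
      <!-- 128 MB = 134217728 bytes -->
      <value>134217728</value>
    </property>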

Server information - HP ProLiant DL380p Gen8 (2U)
*             two Intel Xeon E5-2640 processors (15 MB cache, 2.5 GHz, 7.2 GT/s)
*             48 GB RAM
*             12 x 1 TB (12 TB raw capacity) 6G SAS 7.2K RPM 3.5" HDDs
*             RAID controller that supports RAID 5 with at least 512MB Flash-Backed Write Cache (FBWC)
*             on-board adapter of 4 x 1GbE Ethernet port
*             2 hot-pluggable power supply units

I've configured two servers with virtual machines as described below:
Server 1:
1 Name Node - 32 GB RAM, 300 GB HDD space
4 Data Nodes - 16 GB RAM, 300 GB HDD space

Server 2:
1 Secondary Name Node - 32 GB RAM, 300 GB HDD space
4 Data Nodes - 16 GB RAM, 300 GB HDD space

I've checked that the disk space used per data node is about 20% on average, so I don't understand why the error message complains about "no space left on device".

Any help is much appreciated.

Thank you.

Regards,
Khai Cher


Re: Unable to perform terasort for 50GB of data

Posted by inelu nagamallikarjuna <ma...@gmail.com>.
Hi,

Check the individual data nodes' usage:

    hadoop dfsadmin -report
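Since HDFS itself is only about 20% used, the "no space left on device" error most likely refers to the local disk that holds intermediate map output (mapred.local.dir, which by default lives under /tmp). A quick way to confirm, on each data node (the path below assumes the default hadoop.tmp.dir of /tmp/hadoop-${user.name}):

    # free space on the filesystem backing /tmp
    df -h /tmp

    # size of the task tracker's local working directory
    du -sh /tmp/hadoop-*/mapred/local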
Also, override the config parameter mapred.local.dir so that intermediate data is stored somewhere other than the /tmp directory. And don't use a single reducer: increase the number of reducers and use TotalOrderPartitioner.
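For example (a minimal sketch; the mount points here are placeholders for whatever large local partitions your VMs actually have), in mapred-site.xml:

    <property>
      <name>mapred.local.dir</name>
      <!-- comma-separated list of directories on large local partitions -->
      <value>/data1/mapred/local,/data2/mapred/local</value>
    </property>

Then run the sort with more reducers, e.g.:

    # jar name and reducer count are illustrative; roughly 1-2 reducers
    # per core across the cluster is a common starting point
    hadoop jar hadoop-examples.jar terasort \
        -D mapred.reduce.tasks=32 \
        /terasort/input /terasort/output

Note that terasort samples its input and builds its own total order partitioner automatically, so raising mapred.reduce.tasks is enough to spread the globally sorted output across the cluster.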

Thanks
Nagamallikarjuna