Posted to common-user@hadoop.apache.org by Dongsheng Wang <ph...@yahoo.com> on 2007/04/27 21:47:46 UTC

help needed for hadoop

One of my tasks is to calculate some statistics from a very large volume of log files for our customers. We are trying out Hadoop to solve this problem.
From what I can see, it is exactly the kind of problem Hadoop was designed to solve.

The mapper and reducer code is very straightforward. But when we run it on a two-node cluster, it is surprisingly slow. It has been running for three hours and has not finished even half of the 250M log files. I am also not seeing much disk or network usage.

Is there something I should check, such as configuration settings?

Any help will be appreciated. If more information is needed, let me know.

Thanks in advance

       