Posted to common-user@hadoop.apache.org by Dongsheng Wang <ph...@yahoo.com> on 2007/04/27 21:47:46 UTC
help needed for hadoop
One of my tasks is to calculate some statistics from a very large number of log files for our customers. We are trying out Hadoop to solve this problem.
From what I can see, it is exactly the kind of problem Hadoop was designed to solve.
The mapper and reducer code is very straightforward. But when we run it on a two-node cluster, it is surprisingly slow. It has been running for three hours and has not finished half of the 250M log files. Also, I am not seeing much disk or network usage.
I want to know if there is something I should check. (Maybe some configurations?)
Any help will be appreciated. If there is more information needed, let me know.
Thanks in advance