Posted to user@mahout.apache.org by "Sengupta, Sohini IN BLR SISL" <so...@siemens.com> on 2011/06/22 13:45:09 UTC

meanshift reduce task problem

Hi,

I have programmatically called setNumReduceTasks(16) in MeanShiftCanopyDriver.java. On execution the number of reducers is set correctly (16, as shown on the JobTracker screen), but on digging deeper I see that one node receives the bulk of the bytes to process while the other nodes receive only a nominal amount. As a result, the reduce phase becomes very slow after reaching 98% completion.

I am running this on a cluster of 18 nodes. The load is distributed evenly in the map phase but not in the reduce phase. This happens with both Mahout 0.4 and 0.5. Has anyone faced this problem, and is there a way to get around it?
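
To illustrate the kind of skew described above, here is a minimal sketch in plain Java with no Hadoop dependency. It reproduces the arithmetic of Hadoop's default HashPartitioner (non-negative key hash modulo the number of reduce tasks) and counts how many records each of 16 reducers would receive. The class name ReducerSkewDemo, the key "canopy", and the record counts are hypothetical, and it is only an assumption, not a confirmed diagnosis, that the mean shift mappers emit their output under one or very few distinct keys.

```java
// Sketch only: imitates the formula of Hadoop's default HashPartitioner
// without depending on Hadoop. Key names here are assumed examples.
class ReducerSkewDemo {

    // Same arithmetic as HashPartitioner: non-negative hash modulo reducer count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many records each reducer would receive for the given keys.
    static int[] recordsPerReducer(String[] keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int reducers = 16;
        String[] sameKey = new String[1000];
        String[] distinctKeys = new String[1000];
        for (int i = 0; i < 1000; i++) {
            sameKey[i] = "canopy";            // every record shares one key
            distinctKeys[i] = "canopy-" + i;  // every record has its own key
        }
        // With a single key, exactly one reducer receives all records.
        System.out.println(java.util.Arrays.toString(recordsPerReducer(sameKey, reducers)));
        // With many distinct keys, records spread over several reducers.
        System.out.println(java.util.Arrays.toString(recordsPerReducer(distinctKeys, reducers)));
    }
}
```

If the job's map output keys do look like the first case, raising the number of reduce tasks alone would not spread the load, because the partition is decided by the key and not by the reducer count.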
Thanks a lot in advance,
Sohini
