Posted to user@mahout.apache.org by Anna Lahoud <an...@gmail.com> on 2012/09/27 17:14:56 UTC

Seq2sparse numReducers is not passed to some jobs

Of the sequence of jobs that seq2sparse runs, only some are given the
number-of-reducers parameter. Is there any reason why we couldn't pass it
along to the jobs that are not getting it: (1) wordcount, (2) df counting,
and (3) tfidf make partial vectors? I am trying to reduce the running time
of my job, and the tfidf make-partial jobs in particular consume a big part
of the overall running time, but they only get the default single reducer.

Anna

Re: Seq2sparse numReducers is not passed to some jobs

Posted by Abbas Gadhia <ab...@yahoo.com>.
I have the exact same problem. Some of the longer-running sub-tasks get
only 1 reducer, and each reduce task runs for 2 to 3 hours!