Posted to user@mahout.apache.org by "Robin M. E. Swezey" <ro...@swezey.fr> on 2010/12/10 11:55:28 UTC

Queueing several Mahout classifier training jobs

Hello,

I am currently using Mahout to train and test several CBayes
classifiers with 10-fold cross-validation.

I launch all 10 jobs of a cross-validation run simultaneously, using
the mahout trainclassifier|testclassifier commands in a script.
However, when looking at the Hadoop job journal after queueing the 10
jobs, it appears that every Bayes Feature Driver job, upon completion,
exits without queueing its subsequent jobs (TF-Idf Driver, Bayes
Weight Summer Driver, Complementary Bayes Theta Normalizer).

This does not happen if I queue only 4 Mahout CBayes classifier
training jobs in Hadoop; in that case the subsequent TF-Idf jobs etc.
are queued fine.

Do you have any idea of the cause of this behavior? I would really
like to be able to queue as many Mahout training jobs as possible.

Best regards

-- 
Robin M. E. Swezey
--
Web/AI PhD Candidate
--
+81 (0) 90 1785 1337
robin@swezey.fr
http://www-toralab.ics.nitech.ac.jp/

Re: Queueing several Mahout classifier training jobs

Posted by Robin Anil <ro...@gmail.com>.
Do these M/R jobs have any conflicting temp, input, or output
directories? Are there any errors being thrown?



Re: Re: Queueing several Mahout classifier training jobs

Posted by "Robin M. E. Swezey" <ro...@swezey.fr>.
Hello,

I am sorry, it was a mistake on my part. This happened because the
SSH session to the master node timed out, which in turn shut down the
drivers launched by the series of mahout commands and so prevented
the subsequent jobs from being queued.

The remedy is to launch the series of mahout commands with nohup, so
that the driver processes keep running independently of the SSH
session and can queue all the necessary jobs.
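
A minimal sketch of the nohup approach described above, not the exact
script used here. The function name launch_detached, the LOG_DIR
variable and the log/pid file names are assumptions made for
illustration; the mahout options are left for you to fill in.

```shell
#!/bin/sh
# Sketch only: launch_detached, LOG_DIR and the file names below are
# hypothetical and do not come from the original script.

# Run a command so that it survives the end of the SSH session.
#   $1      label used for the log and pid files (e.g. the fold number)
#   $2...   the command to run, with its arguments
launch_detached() {
    label=$1
    shift
    log_dir=${LOG_DIR:-./fold-logs}
    mkdir -p "$log_dir"
    # nohup makes the command ignore SIGHUP, the signal sent when the
    # terminal or SSH session goes away. Redirecting stdin, stdout and
    # stderr keeps the driver from holding on to the terminal.
    nohup "$@" > "$log_dir/fold-$label.log" 2>&1 < /dev/null &
    # Record the driver's process id so it can be inspected later.
    echo $! > "$log_dir/fold-$label.pid"
}

# Example use (commented out; supply your own options):
# for fold in 1 2 3 4 5 6 7 8 9 10; do
#     launch_detached "$fold" mahout trainclassifier ...your options...
# done
```

Each driver then writes to its own log file, so if a later job is not
queued, the log for that fold should show why.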

Thanks for your response and sorry for the inconvenience.

Best regards

> Do these M/Rs have any conflicting temp, input, output directories. Are
> there any errors being thrown. ?
