Posted to dev@mahout.apache.org by "Kris Jack (JIRA)" <ji...@apache.org> on 2010/04/09 11:59:50 UTC

[jira] Created: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

Partitioning Collaborative Filtering Job into Maps and Reduces
--------------------------------------------------------------

                 Key: MAHOUT-372
                 URL: https://issues.apache.org/jira/browse/MAHOUT-372
             Project: Mahout
          Issue Type: Question
          Components: Collaborative Filtering
    Affects Versions: 0.4
         Environment: Ubuntu Koala
            Reporter: Kris Jack


I am running the main method of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob on my Hadoop cluster, and it partitions the job into 2 although I have more than 2 nodes available.  I read that the partitioning can be changed by calling conf.setNumMapTasks(int num) and conf.setNumReduceTasks(int num) on the JobConf.

Would I be right in assuming that increasing these (say, to 4) would speed up the processing?  Can this code be partitioned into many reducers?  If so, would the protected AbstractJob method JobConf prepareJobConf() be an appropriate place to set them?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

Posted by "Kris Jack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855381#action_12855381 ] 

Kris Jack commented on MAHOUT-372:
----------------------------------

Thanks for your reply.  I'll run it using the command line parameters and hopefully get it working faster.  Thanks also for letting me know about the other mailing list; I'll use that in the future for such questions.



[jira] Resolved: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-372.
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4
         Assignee: Sean Owen

Yes, sure; there's no particular limit to the number of mappers or reducers.

These are Hadoop params, which you can set on the command line with, for example:
-Dmapred.map.tasks=10 -Dmapred.reduce.tasks=10

Reopen if that doesn't quite answer the question. (We can also discuss on mahout-user@apache.org, perhaps, if this isn't necessarily a bug or enhancement request.)
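
To illustrate how the -D options above end up as job settings, here is a minimal plain-Java sketch. It is not Hadoop's actual option handling (Hadoop does this in its own GenericOptionsParser, which is more involved); the class name GenericOptionsSketch, its methods, the fallback values, and the "--input prefs" arguments are all hypothetical and only there for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: mimics how -Dkey=value options become configuration entries. */
public class GenericOptionsSketch {

    /** Collects every -Dkey=value argument into a map; other arguments are ignored. */
    public static Map<String, String> parseDefines(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String arg : args) {
            if (arg.startsWith("-D")) {
                int eq = arg.indexOf('=');
                // eq > 2 guarantees a non-empty key between "-D" and "="
                if (eq > 2) {
                    conf.put(arg.substring(2, eq), arg.substring(eq + 1));
                }
            }
        }
        return conf;
    }

    /** Reads an integer setting, falling back to a default when the key is absent. */
    public static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        // Hypothetical command line: the two -D options from the reply plus job arguments.
        String[] example = {"-Dmapred.map.tasks=10", "-Dmapred.reduce.tasks=10", "--input", "prefs"};
        Map<String, String> conf = parseDefines(example);
        // The fallback values here are placeholders, not claims about any cluster's defaults.
        System.out.println("map tasks    = " + getInt(conf, "mapred.map.tasks", 2));
        System.out.println("reduce tasks = " + getInt(conf, "mapred.reduce.tasks", 1));
    }
}
```

One caveat worth keeping in mind: Hadoop's JobConf documents setNumMapTasks as only a hint, since the number of map tasks follows from how the input is split, whereas the reduce count is taken as given. So the reduce setting is likely the one with the more predictable effect.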
