You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@mahout.apache.org by "Jeff Eastman (JIRA)" <ji...@apache.org> on 2010/06/13 20:08:13 UTC

[jira] Created: (MAHOUT-414) Usability: Mahout applications need a consistent API to allow users to specify desired map/reduce concurrency

Usability: Mahout applications need a consistent API to allow users to specify desired map/reduce concurrency
-------------------------------------------------------------------------------------------------------------

                 Key: MAHOUT-414
                 URL: https://issues.apache.org/jira/browse/MAHOUT-414
             Project: Mahout
          Issue Type: Bug
    Affects Versions: 0.3
            Reporter: Jeff Eastman
             Fix For: 0.4


If specifying the number of mappers and reducers is a common activity which users need to perform in running Mahout applications on Hadoop clusters then we need to have a standard way of specifying them in our APIs without exposing the full set of Hadoop options, especially for our non-power-users. This is the case for some applications already but others require the use of Hadoop-level -D arguments to achieve reasonable out-of-the-box parallelism even when running our examples. The usability defect is that some of our algorithms won't scale without it and that we don't have a standard way to express this in our APIs. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (MAHOUT-414) Usability: Mahout applications need a consistent API to allow users to specify desired map/reduce concurrency

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAHOUT-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916490#action_12916490 ] 

Sean Owen commented on MAHOUT-414:
----------------------------------

Jeff would you regard this as complete? I'm the only other one that commented here and am personally pleased to let you finish this out as you like.

> Usability: Mahout applications need a consistent API to allow users to specify desired map/reduce concurrency
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-414
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-414
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jeff Eastman
>             Fix For: 0.4
>
>
> If specifying the number of mappers and reducers is a common activity which users need to perform in running Mahout applications on Hadoop clusters then we need to have a standard way of specifying them in our APIs without exposing the full set of Hadoop options, especially for our non-power-users. This is the case for some applications already but others require the use of Hadoop-level -D arguments to achieve reasonable out-of-the-box parallelism even when running our examples. The usability defect is that some of our algorithms won't scale without it and that we don't have a standard way to express this in our APIs. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.