Posted to dev@mahout.apache.org by "Sebastian Schelter (JIRA)" <ji...@apache.org> on 2011/05/19 12:00:47 UTC

[jira] [Created] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Refactor PredictionJob to use MultipleInputs for reduce side joins
------------------------------------------------------------------

                 Key: MAHOUT-704
                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
             Project: Mahout
          Issue Type: Improvement
          Components: Collaborative Filtering
    Affects Versions: 0.6
            Reporter: Sebastian Schelter
            Assignee: Sebastian Schelter


The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.

We'd need to add another prepareJob() method to AbstractJob in order to make this work.

This is a rather cosmetic feature request that can wait till after the 0.5 release.
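
For readers unfamiliar with the pattern, here is a minimal in-memory sketch of a reduce-side join in plain Java. It is not the PredictionJob code and does not use Hadoop. The class name, the "L"/"R" tags and the tab-separated record layout are assumptions made up for illustration. Each input gets its own "mapper" that tags its records, the records are grouped by key, and the "reducer" pairs the values from both sides. In Hadoop, the per-input mapper wiring is what MultipleInputs.addInputPath() provides, which is why a single job can do the join.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public final class ReduceSideJoinSketch {

  // Tags marking which input a value came from (hypothetical names).
  static final String LEFT = "L";
  static final String RIGHT = "R";

  // "Mapper" for the first input: lines of the form key<TAB>payload.
  static Map.Entry<String, String[]> mapLeft(String line) {
    String[] parts = line.split("\t", 2);
    return new SimpleEntry<>(parts[0], new String[] {LEFT, parts[1]});
  }

  // "Mapper" for the second input: same layout, different tag.
  static Map.Entry<String, String[]> mapRight(String line) {
    String[] parts = line.split("\t", 2);
    return new SimpleEntry<>(parts[0], new String[] {RIGHT, parts[1]});
  }

  // Stand-in for the shuffle phase: group all tagged values by key.
  static Map<String, List<String[]>> shuffle(List<Map.Entry<String, String[]>> mapped) {
    Map<String, List<String[]>> grouped = new TreeMap<>();
    for (Map.Entry<String, String[]> e : mapped) {
      grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
    }
    return grouped;
  }

  // "Reducer": separate the values by tag, then emit every left/right pair
  // (an inner join; keys present on only one side produce no output).
  static List<String> reduce(String key, List<String[]> values) {
    List<String> lefts = new ArrayList<>();
    List<String> rights = new ArrayList<>();
    for (String[] v : values) {
      (LEFT.equals(v[0]) ? lefts : rights).add(v[1]);
    }
    List<String> joined = new ArrayList<>();
    for (String l : lefts) {
      for (String r : rights) {
        joined.add(key + "\t" + l + "\t" + r);
      }
    }
    return joined;
  }

  // Runs both "mappers", the grouping step and the "reducer" in one pass.
  static List<String> join(List<String> leftLines, List<String> rightLines) {
    List<Map.Entry<String, String[]>> mapped = new ArrayList<>();
    for (String line : leftLines) {
      mapped.add(mapLeft(line));
    }
    for (String line : rightLines) {
      mapped.add(mapRight(line));
    }
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, List<String[]>> e : shuffle(mapped).entrySet()) {
      out.addAll(reduce(e.getKey(), e.getValue()));
    }
    return out;
  }

  public static void main(String[] args) {
    // Made-up sample records, keyed by a hypothetical user ID.
    List<String> left = Arrays.asList("u1\ta", "u2\tb");
    List<String> right = Arrays.asList("u1\tx", "u1\ty", "u3\tz");
    for (String line : join(left, right)) {
      System.out.println(line);
    }
  }
}
```

Without per-input mappers, a job would typically need an extra pass to bring both inputs into a common tagged format first, which is presumably where the saved M/R cycles would come from.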

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Posted by "Sean Owen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-704:
-----------------------------

    Fix Version/s:     (was: 1.0)
    
> Refactor PredictionJob to use MultipleInputs for reduce side joins
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-704
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5, 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>              Labels: collaborative-filtering, hadoop, mapreduce
>
> The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.
> We'd need to add another prepareJob() method to AbstractJob in order to make this work.
> This is a rather cosmetic feature request that can wait till after the 0.5 release.


[jira] [Resolved] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter resolved MAHOUT-704.
---------------------------------------

    Resolution: Won't Fix

MultipleInputs is not part of the new API in Hadoop 0.20 and we won't upgrade in the near future...

> Refactor PredictionJob to use MultipleInputs for reduce side joins
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-704
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5, 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>              Labels: collaborative-filtering, hadoop, mapreduce
>             Fix For: 1.0
>
>
> The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.
> We'd need to add another prepareJob() method to AbstractJob in order to make this work.
> This is a rather cosmetic feature request that can wait till after the 0.5 release.


[jira] [Updated] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-704:
-----------------------------

    Affects Version/s: 0.5
        Fix Version/s: 1.0
               Labels: collaborative-filtering hadoop mapreduce  (was: )

> Refactor PredictionJob to use MultipleInputs for reduce side joins
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-704
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5, 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>              Labels: collaborative-filtering, hadoop, mapreduce
>             Fix For: 1.0
>
>
> The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.
> We'd need to add another prepareJob() method to AbstractJob in order to make this work.
> This is a rather cosmetic feature request that can wait till after the 0.5 release.


[jira] [Commented] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037455#comment-13037455 ] 

Sebastian Schelter commented on MAHOUT-704:
-------------------------------------------

I rechecked and saw that org.apache.hadoop.mapreduce.lib.input.MultipleInputs does not yet exist in the Hadoop version we currently use, so this ticket is only valid if we choose to upgrade for 0.6.

> Refactor PredictionJob to use MultipleInputs for reduce side joins
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-704
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>
> The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.
> We'd need to add another prepareJob() method to AbstractJob in order to make this work.
> This is a rather cosmetic feature request that can wait till after the 0.5 release.


[jira] [Updated] (MAHOUT-704) Refactor PredictionJob to use MultipleInputs for reduce side joins

Posted by "Sebastian Schelter (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-704:
--------------------------------------

    Description: 
The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.

We'd need to add another prepareJob() method to AbstractJob in order to make this work.

This is a rather cosmetic feature request that can wait till after the 0.5 release.

  was:
The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use import org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce side joins. This should spare us some M/R cycles and greatly simplify the code.

We'd need to add an other prepareJob() method to AbstractJob in order to make this work.

This is a rather cosmetic feature request that can wait till after the 0.5 release.


> Refactor PredictionJob to use MultipleInputs for reduce side joins
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-704
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-704
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.6
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>
> The code in org.apache.mahout.cf.taste.hadoop.als.PredictionJob should be refactored to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs for the reduce-side joins. This should spare us some M/R cycles and greatly simplify the code.
> We'd need to add another prepareJob() method to AbstractJob in order to make this work.
> This is a rather cosmetic feature request that can wait till after the 0.5 release.
