Posted to dev@mahout.apache.org by "issei yoshida (Created) (JIRA)" <ji...@apache.org> on 2011/12/07 11:08:40 UTC

[jira] [Created] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Implement SGD based classifiers using MapReduce
-----------------------------------------------

                 Key: MAHOUT-918
                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
             Project: Mahout
          Issue Type: New Feature
          Components: Classification
    Affects Versions: 0.6
            Reporter: issei yoshida


Implement SGD-based classifiers (Logistic Regression, Adaptive Logistic Regression and Passive-Aggressive) using MapReduce.
They are implemented using the Iterative Parameter Mixtures algorithm, which is described in the following papers:

http://research.google.com/pubs/pub36948.html
http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf
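
For readers unfamiliar with the technique, the following is a minimal single-process sketch of iterative parameter mixing: each shard runs one SGD pass starting from the current mixed weights, the per-shard results are averaged, and the average seeds the next iteration. It is not taken from MAHOUT-918.patch. The class and method names (IterativeParameterMixing, trainShard, mix, train), the plain logistic-regression update, the fixed learning rate and the mixing weights proportional to shard size are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative only: none of these names come from the MAHOUT-918 patch. */
public class IterativeParameterMixing {

    /** "Map" step: one pass of logistic-regression SGD over a shard, starting from init. */
    static double[] trainShard(double[][] x, int[] y, double[] init, double learningRate) {
        double[] w = Arrays.copyOf(init, init.length);
        for (int i = 0; i < x.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < w.length; j++) {
                dot += w[j] * x[i][j];
            }
            double predicted = 1.0 / (1.0 + Math.exp(-dot));
            double error = y[i] - predicted;
            for (int j = 0; j < w.length; j++) {
                w[j] += learningRate * error * x[i][j];
            }
        }
        return w;
    }

    /** "Reduce" step: weighted average of the per-shard weight vectors. */
    static double[] mix(double[][] shardWeights, double[] mixWeights) {
        double total = 0.0;
        for (double m : mixWeights) {
            total += m;
        }
        double[] mixed = new double[shardWeights[0].length];
        for (int k = 0; k < shardWeights.length; k++) {
            for (int j = 0; j < mixed.length; j++) {
                mixed[j] += (mixWeights[k] / total) * shardWeights[k][j];
            }
        }
        return mixed;
    }

    /** Driver: every iteration restarts all shards from the previous mixture. */
    static double[] train(double[][][] shardX, int[][] shardY, int dim,
                          int iterations, double learningRate) {
        double[] w = new double[dim];
        for (int it = 0; it < iterations; it++) {
            double[][] local = new double[shardX.length][];
            double[] mixWeights = new double[shardX.length];
            for (int k = 0; k < shardX.length; k++) {
                local[k] = trainShard(shardX[k], shardY[k], w, learningRate);
                mixWeights[k] = shardX[k].length; // assumption: weight by shard size
            }
            w = mix(local, mixWeights);
        }
        return w;
    }

    public static void main(String[] args) {
        // Two shards; each example is {bias, feature}, label 1 when feature > 0.
        double[][][] shardX = {
            {{1, 2}, {1, -2}},
            {{1, 3}, {1, -1}}
        };
        int[][] shardY = {{1, 0}, {1, 0}};
        double[] w = train(shardX, shardY, 2, 10, 0.1);
        System.out.println(Arrays.toString(w));
    }
}
```

In a MapReduce setting, trainShard would roughly correspond to a map task that emits a single weight vector when it finishes, and mix to the reducer. Whether the patch divides the work exactly this way is not stated in this message.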

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by Lance Norskog <go...@gmail.com>.
Suggestion: enhance examples/bin/classify-20newsgroups.sh to allow
using this to generate the model, along with the online program.

Lance

On Mon, Dec 12, 2011 at 4:06 AM, jiraposter@reviews.apache.org
(Commented) (JIRA) <ji...@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167462#comment-13167462 ]
>
> jiraposter@reviews.apache.org commented on MAHOUT-918:
> ------------------------------------------------------
>
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, line 36
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line36>
> bq.  >
> bq.  >     Needs a comment about how this works.
>
> Added comments.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 67-75
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line67>
> bq.  >
> bq.  >     This really needs a comment.  What is the purpose here?
>
> Added comments.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 98-111
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line98>
> bq.  >
> bq.  >     What is this intended to do?  Why?
>
> Added comments.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 30
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line30>
> bq.  >
> bq.  >     Typo.
> bq.  >
> bq.  >     Also, this doesn't say how this works or why it is the way it is.
>
> Fixed the typo and added comments.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 32
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line32>
> bq.  >
> bq.  >     Shouldn't there be a combiner as well?
>
> A combiner isn't needed because each map task submits one value overall.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 53
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line53>
> bq.  >
> bq.  >     A comment here about what this weight is would be nice.  Also, how can a double be a key?  That is tantamount to comparing doubles which is bad.
>
> Added comments. It is not the weight of the classifier but the weight used in the weighted average.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, line 99
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line99>
> bq.  >
> bq.  >     Where does the InterruptedException come from?
>
> It comes from the runIteration function.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, lines 110-111
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line110>
> bq.  >
> bq.  >     Use brackets
>
> Added brackets.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java, line 35
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63198#file63198line35>
> bq.  >
> bq.  >     Should not throw Exception
>
> Added IOException and InterruptedException.
>
>
> bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
> bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 53-56
> bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line53>
> bq.  >
> bq.  >     This is nearly duplicated code.  The mapper and reducer should share some code to avoid inconsistent defaults.
>
> Created a base class which shares the same initialization code.
>
>
> - issei
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/3072/#review3734
> -----------------------------------------------------------
>
>
> On 2011-12-12 11:51:59, issei yoshida wrote:
> bq.
> bq.  -----------------------------------------------------------
> bq.  This is an automatically generated e-mail. To reply, visit:
> bq.  https://reviews.apache.org/r/3072/
> bq.  -----------------------------------------------------------
> bq.
> bq.  (Updated 2011-12-12 11:51:59)
> bq.
> bq.
> bq.  Review request for mahout.
> bq.
> bq.
> bq.  Summary
> bq.  -------
> bq.
> bq.  MAHOUT-918 Parallelized SGD in MapReduce
> bq.
> bq.
> bq.  This addresses bug MAHOUT-918.
> bq.      https://issues.apache.org/jira/browse/MAHOUT-918
> bq.
> bq.
> bq.  Diffs
> bq.  -----
> bq.
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION
> bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION
> bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
> bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
> bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
> bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION
> bq.
> bq.  Diff: https://reviews.apache.org/r/3072/diff
> bq.
> bq.
> bq.  Testing
> bq.  -------
> bq.
> bq.
> bq.  Thanks,
> bq.
> bq.  issei
> bq.
> bq.
>
>
>
>> Implement SGD based classifiers using MapReduce
>> -----------------------------------------------
>>
>>                 Key: MAHOUT-918
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>>             Project: Mahout
>>          Issue Type: New Feature
>>          Components: Classification
>>    Affects Versions: 0.6
>>            Reporter: issei yoshida
>>         Attachments: MAHOUT-918.patch, design.pdf
>>
>>
>> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
>> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
>> http://research.google.com/pubs/pub36948.html
>> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
>> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf



-- 
Lance Norskog
goksron@gmail.com

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164330#comment-13164330 ] 

issei yoshida commented on MAHOUT-918:
--------------------------------------

I wrote code that distributes Logistic Regression, Adaptive Logistic Regression and Passive-Aggressive with MapReduce.
I would appreciate your comments.
                

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168091#comment-13168091 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3869
-----------------------------------------------------------



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
<https://reviews.apache.org/r/3072/#comment8694>

    This is a useless comment.  The name says the same thing.  Just putting in comments like this to satisfy a request for comments is very frustrating behavior.
    
    WHAT DOES THIS CODE INTEND TO DO AND WHY?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
<https://reviews.apache.org/r/3072/#comment8695>

    Same comment.  This is not a comment.  This is a repetition.  It adds nothing and shouldn't be here.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8696>

    HOW AND WHY?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8697>

    How does the update work?
    
    How is the request for using the final iteration as initial weights made?
    
    Why does it work this way?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8698>

    Iterations of what?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8699>

    This is another non-comment.


- Ted



[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168118#comment-13168118 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  >
bq.  
bq.  Ted Dunning wrote:
bq.      This code got worse with these comments, not better.

Would you mind reviewing Diff revision 3?
You still seem to be looking at revision 2.


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3734
-----------------------------------------------------------



[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168193#comment-13168193 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/
-----------------------------------------------------------

(Updated 2011-12-13 07:32:38.895973)


Review request for mahout.


Summary
-------

MAHOUT-918 Parallelized SGD in MapReduce


This addresses bug MAHOUT-918.
    https://issues.apache.org/jira/browse/MAHOUT-918


Diffs (updated)
-----

  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3072/diff


Testing
-------


Thanks,

issei


                

[jira] [Updated] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

issei yoshida updated MAHOUT-918:
---------------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Updated] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

issei yoshida updated MAHOUT-918:
---------------------------------

    Attachment: design.pdf
    

[jira] [Updated] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

issei yoshida updated MAHOUT-918:
---------------------------------

    Status: Open  (was: Patch Available)
    

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "Ted Dunning (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164567#comment-13164567 ] 

Ted Dunning commented on MAHOUT-918:
------------------------------------

Algorithmically, simply gluing several classifiers into a map-reduce framework doesn't really change much.

Much more interesting would be to do something along these lines:

http://arxiv.org/pdf/1107.2490

or this:

http://cacm.acm.org/blogs/blog-cacm/144075-hadoop-allreduce-and-terascale-learning/fulltext


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169213#comment-13169213 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/
-----------------------------------------------------------

(Updated 2011-12-14 08:59:29.074032)


Review request for mahout.


Summary
-------

MAHOUT-918 Parallelized SGD in MapReduce


This addresses bug MAHOUT-918.
    https://issues.apache.org/jira/browse/MAHOUT-918


Diffs (updated)
-----

  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1214116 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3072/diff


Testing
-------


Thanks,

issei


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169223#comment-13169223 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 36-41
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line36>
bq.  >
bq.  >     Direct and exact quotes from the paper should be either avoided or acknowledged.  Better here to rephrase the language.

Rephrased the language in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 60-63
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line60>
bq.  >
bq.  >     Again, just quoting the paper is not a good idea.  This isn't adding any information in any case since the exact same language was used in the class-level Javadoc.
bq.  >     
bq.  >     It would be nice here to note that the average is an *unweighted* average.

Rephrased the language in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, lines 87-88
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line87>
bq.  >
bq.  >     This looks like a bad key to use here.

This key should be the average log-likelihood of the best OnlineLogisticRegression in AdaptiveLogisticRegression.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, line 40
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line40>
bq.  >
bq.  >     I don't think that this is correct.  Is this really what the output is?  Why are you dividing by a weight vector?  How do you compute this score?
bq.  >     
bq.  >     Or do you mean to not divide here?
bq.  >     
bq.  >     If so, why do you use a score as the key?

My explanation may have been unclear: the Map output key is the score and the Map output value is the new weight vector.
I rewrote the comment in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java, lines 34-35
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64285#file64285line34>
bq.  >
bq.  >     I don't think that this is correct.  In the Google paper, the average was unweighted.  In any case how do you compute this score for weighting?
bq.  >     
bq.  >     Also, if the key is the score, how does the reducer work since each reduce function will only see one score?  Are you assuming that there is exactly one reducer?

The original paper (http://aclweb.org/anthology-new/N/N10/N10-1069.pdf) says it is a weighted average,
but in my simple experiment the unweighted average performed better than the weighted average.
So I changed the code to use the unweighted average in revision 5.

The number of reducers should be set to one. I added a comment to that effect in revision 5.
The number of reducers is set in the runIteration function of the Driver class.
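
As an illustration of the unweighted mixing step discussed here, the following is a minimal sketch in plain Java. It is not taken from the MAHOUT-918 patch: the class name ParameterMixing and the method average are hypothetical, and weight vectors are represented as plain double[] arrays rather than Mahout Vector objects.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the MAHOUT-918 patch. */
final class ParameterMixing {

  /** Returns the unweighted (uniform) mean of the weight vectors produced by the mappers. */
  static double[] average(List<double[]> mapperWeights) {
    if (mapperWeights.isEmpty()) {
      throw new IllegalArgumentException("need at least one weight vector");
    }
    int n = mapperWeights.get(0).length;
    double[] mixed = new double[n];
    // Sum the vectors component by component.
    for (double[] w : mapperWeights) {
      if (w.length != n) {
        throw new IllegalArgumentException("dimension mismatch");
      }
      for (int i = 0; i < n; i++) {
        mixed[i] += w[i];
      }
    }
    // Every mapper contributes equally, regardless of any score it reported.
    for (int i = 0; i < n; i++) {
      mixed[i] /= mapperWeights.size();
    }
    return mixed;
  }

  public static void main(String[] args) {
    List<double[]> shards = Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 6.0});
    System.out.println(Arrays.toString(average(shards))); // prints [2.0, 4.0]
  }
}
```

In an iterative scheme the averaged vector would presumably be handed back to the mappers as the starting point for the next pass; how the patch does this is not shown here.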


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3875
-----------------------------------------------------------


On 2011-12-14 08:59:29, issei yoshida wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3072/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-12-14 08:59:29)
bq.  
bq.  
bq.  Review request for mahout.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  MAHOUT-918 Parallelized SGD in MapReduce
bq.  
bq.  
bq.  This addresses bug MAHOUT-918.
bq.      https://issues.apache.org/jira/browse/MAHOUT-918
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1214116 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3072/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  issei
bq.  
bq.


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168194#comment-13168194 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  >
bq.  
bq.  Ted Dunning wrote:
bq.      This code got worse with these comments, not better.
bq.  
bq.  issei yoshida wrote:
bq.      Would you mind reviewing Diff revision 3?
bq.      You still seem to be looking at revision 2.

I uploaded diff revision 4, which adds some comments,
so please see revision 4.
https://reviews.apache.org/r/3072/diff/


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3734
-----------------------------------------------------------


On 2011-12-13 07:32:38, issei yoshida wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3072/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-12-13 07:32:38)
bq.  
bq.  
bq.  Review request for mahout.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  MAHOUT-918 Parallelized SGD in MapReduce
bq.  
bq.  
bq.  This addresses bug MAHOUT-918.
bq.      https://issues.apache.org/jira/browse/MAHOUT-918
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3072/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  issei
bq.  
bq.


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Updated] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

issei yoshida updated MAHOUT-918:
---------------------------------

    Attachment: MAHOUT-918.patch
    
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168370#comment-13168370 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3875
-----------------------------------------------------------



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java
<https://reviews.apache.org/r/3072/#comment8703>

    Direct and exact quotes from the paper should be either avoided or acknowledged.  Better here to rephrase the language.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java
<https://reviews.apache.org/r/3072/#comment8704>

    Again, just quoting the paper is not a good idea.  This isn't adding any information in any case since the exact same language was used in the class-level Javadoc.
    
    It would be nice here to note that the average is an *unweighted* average.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java
<https://reviews.apache.org/r/3072/#comment8705>

    I don't think that this is correct.  Is this really what the output is?  Why are you dividing by a weight vector?  How do you compute this score?
    
    Or do you mean to not divide here?
    
    If so, why do you use a score as the key?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java
<https://reviews.apache.org/r/3072/#comment8706>

    This looks like a bad key to use here.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java
<https://reviews.apache.org/r/3072/#comment8707>

    I don't think that this is correct.  In the Google paper, the average was unweighted.  In any case how do you compute this score for weighting?
    
    Also, if the key is the score, how does the reducer work since each reduce function will only see one score?  Are you assuming that there is exactly one reducer?
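
To make the grouping concern above concrete, the following sketch imitates in plain Java how a MapReduce shuffle hands values to reduce calls: one call per distinct key. The class ShuffleGrouping and the method group are hypothetical and use no Hadoop or Mahout APIs; the scores and vectors are made-up illustrative numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of shuffle grouping; not part of the MAHOUT-918 patch. */
final class ShuffleGrouping {

  /** Groups values by key: each map entry stands for the input of one reduce call. */
  static <K extends Comparable<K>> Map<K, List<double[]>> group(List<K> keys, List<double[]> values) {
    Map<K, List<double[]>> groups = new TreeMap<>();
    for (int i = 0; i < keys.size(); i++) {
      groups.computeIfAbsent(keys.get(i), k -> new ArrayList<>()).add(values.get(i));
    }
    return groups;
  }

  public static void main(String[] args) {
    List<double[]> vectors = Arrays.asList(
        new double[] {0.1, 0.2}, new double[] {0.3, 0.4}, new double[] {0.5, 0.6});

    // Keyed by (distinct) scores: every group holds exactly one vector.
    List<Double> scores = Arrays.asList(-0.31, -0.27, -0.40);
    System.out.println("groups keyed by score: " + group(scores, vectors).size());

    // Keyed by a constant: a single group holds all vectors.
    List<Integer> constant = Arrays.asList(0, 0, 0);
    System.out.println("groups keyed by constant: " + group(constant, vectors).size());
  }
}
```

With distinct scores as keys, no single reduce call sees more than one weight vector, so an average over all mappers would have to be accumulated across calls within one reducer, or the vectors emitted under a shared key instead.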


- Ted


On 2011-12-13 07:32:38, issei yoshida wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3072/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-12-13 07:32:38)
bq.  
bq.  
bq.  Review request for mahout.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  MAHOUT-918 Parallelized SGD in MapReduce
bq.  
bq.  
bq.  This addresses bug MAHOUT-918.
bq.      https://issues.apache.org/jira/browse/MAHOUT-918
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3072/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  issei
bq.  
bq.


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165045#comment-13165045 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/
-----------------------------------------------------------

Review request for mahout.


Summary
-------

MAHOUT-918 Parallelized SGD in MapReduce


This addresses bug MAHOUT-918.
    https://issues.apache.org/jira/browse/MAHOUT-918


Diffs
-----

  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1211755 

Diff: https://reviews.apache.org/r/3072/diff


Testing
-------


Thanks,

issei


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165047#comment-13165047 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/
-----------------------------------------------------------

(Updated 2011-12-08 06:52:01.921057)


Review request for mahout.


Summary
-------

MAHOUT-918 Parallelized SGD in MapReduce


This addresses bug MAHOUT-918.
    https://issues.apache.org/jira/browse/MAHOUT-918


Diffs (updated)
-----

  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1211755 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3072/diff


Testing
-------


Thanks,

issei


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167459#comment-13167459 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/
-----------------------------------------------------------

(Updated 2011-12-12 11:51:59.547649)


Review request for mahout.


Summary
-------

MAHOUT-918 Parallelized SGD in MapReduce


This addresses bug MAHOUT-918.
    https://issues.apache.org/jira/browse/MAHOUT-918


Diffs (updated)
-----

  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
  trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
  trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3072/diff


Testing
-------


Thanks,

issei


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Updated] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

issei yoshida updated MAHOUT-918:
---------------------------------

    Status: Patch Available  (was: Open)
    
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "Ted Dunning (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164561#comment-13164561 ] 

Ted Dunning commented on MAHOUT-918:
------------------------------------

Can you post this as a Review Board review?  There are lots of comments to be made.

At a high level, I note the following issues:

1) I don't see a design document.  You cite a few articles but you don't say what you are really doing.

2) Is map-reduce an appropriate approach here for model averaging?

3) How do you plan to deal with randomization of data order?

4) There are a number of style issues:

   a) you have loops that look like this:
{code}
          for (...) {
             if (something) {
                ... stuff ...
                continue;
             }
             ... other stuff ...
             break;
          }
{code}
This is slightly perverse and is akin to using goto statements.  Much better is this:
{code}

          for (...) {
             if (something) {
                ... stuff ...
             } else {
                ... other stuff ...
                break;
             }
          }
{code}
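
A concrete version of the second form, as a minimal sketch rather than code from the patch: the class and method names are made up for illustration, and the task (counting leading comment lines) is only an assumed stand-in for the loops under review.

```java
import java.util.Arrays;
import java.util.List;

class LoopStyleSketch {
    // Counts the leading lines that start with '#', stopping at the first line that does not.
    static int countLeadingComments(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            if (line.startsWith("#")) {
                count++;
            } else {
                // The first non-comment line ends the scan.
                break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# a", "# b", "x", "# c");
        System.out.println(countLeadingComments(lines));
    }
}
```

Every iteration takes exactly one of the two branches, so the exit condition can be read off the else branch without tracing a continue.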

                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168092#comment-13168092 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  >

This code got worse with these comments, not better.


- Ted


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3734
-----------------------------------------------------------




                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "issei yoshida (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165049#comment-13165049 ] 

issei yoshida commented on MAHOUT-918:
--------------------------------------

I posted the code in the review board and attached a design document.
https://reviews.apache.org/r/3072/

>  Is map-reduce an appropriate approach here for model averaging?
MPI or other frameworks may produce better results,
but the important thing is that a MapReduce implementation is easy for Hadoop users to adopt.
Some iterative algorithms already implemented in Mahout (K-means and other clustering algorithms) may not be ideally suited to MapReduce either, but that is not the point.

The cited papers report that Iterative Parameter Mixtures is the most effective way to distribute SGD in MapReduce.
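
For readers unfamiliar with the technique, the idea can be sketched in plain Java without Hadoop: each shard runs sequential SGD starting from the current weights, the per-shard results are averaged, and the average seeds the next round. This is an illustration under assumptions, not the code in the patch: the class name, the toy data, the uniform average, and the learning rate are all made up here, and in a MapReduce job the per-shard pass would presumably run in the mappers and the averaging in a reducer.

```java
import java.util.Arrays;

class ParameterMixtureSketch {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One sequential SGD pass of logistic regression over a single shard, starting from `start`.
    static double[] trainShard(double[][] x, int[] y, double[] start, double rate) {
        double[] w = start.clone();
        for (int i = 0; i < x.length; i++) {
            double z = 0.0;
            for (int j = 0; j < w.length; j++) {
                z += w[j] * x[i][j];
            }
            double err = y[i] - sigmoid(z);
            for (int j = 0; j < w.length; j++) {
                w[j] += rate * err * x[i][j];
            }
        }
        return w;
    }

    // Uniform average of the per-shard weight vectors (the "mixture" step).
    static double[] average(double[][] models) {
        double[] avg = new double[models[0].length];
        for (double[] m : models) {
            for (int j = 0; j < avg.length; j++) {
                avg[j] += m[j] / models.length;
            }
        }
        return avg;
    }

    // Repeats "train every shard from the shared weights, then average" for a fixed number of rounds.
    static double[] iterate(double[][][] shardX, int[][] shardY, int dims, int rounds, double rate) {
        double[] w = new double[dims];
        for (int r = 0; r < rounds; r++) {
            double[][] models = new double[shardX.length][];
            for (int s = 0; s < shardX.length; s++) {
                models[s] = trainShard(shardX[s], shardY[s], w, rate);
            }
            w = average(models);
        }
        return w;
    }

    public static void main(String[] args) {
        // Toy data: feature 0 is a bias term, feature 1 is the signal; the label is 1 when the signal is positive.
        double[][][] shardX = {
            {{1, 2}, {1, -2}, {1, 1}},
            {{1, -1}, {1, 3}, {1, -3}}
        };
        int[][] shardY = {{1, 0, 1}, {0, 1, 0}};
        System.out.println(Arrays.toString(iterate(shardX, shardY, 2, 20, 0.5)));
    }
}
```

The weights are averaged after every round rather than once at the end, which is what makes the mixture iterative.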

> How do you plan to deal with randomization of data order?
It may be possible to randomize data order by customizing InputFormat.
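
As a rough illustration of what randomizing the order means, here is an in-memory sketch that shuffles records with a seeded generator. It deliberately does not implement a Hadoop InputFormat; the class and method names are hypothetical, and how such a shuffle would be wired into a record reader is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ShuffleSketch {
    // Returns a shuffled copy of the records; the same seed always yields the same order.
    static <T> List<T> shuffled(List<T> records, long seed) {
        List<T> copy = new ArrayList<>(records);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> records = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(shuffled(records, 42L));
    }
}
```

A fixed seed keeps runs reproducible; a different seed per iteration would give each pass a different order.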
                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165051#comment-13165051 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3734
-----------------------------------------------------------



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
<https://reviews.apache.org/r/3072/#comment8405>

    Needs a comment about how this works.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
<https://reviews.apache.org/r/3072/#comment8406>

    This is nearly duplicated code.  The mapper and reducer should share some code to avoid inconsistent defaults.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
<https://reviews.apache.org/r/3072/#comment8407>

    This really needs a comment.  What is the purpose here?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
<https://reviews.apache.org/r/3072/#comment8408>

    What is this intended to do?  Why?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
<https://reviews.apache.org/r/3072/#comment8403>

    Typo.
    
    Also, this doesn't say how this works or why it is the way it is.



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
<https://reviews.apache.org/r/3072/#comment8404>

    Shouldn't there be a combiner as well?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
<https://reviews.apache.org/r/3072/#comment8402>

    A comment here about what this weight is would be nice.  Also, how can a double be a key?  That is tantamount to comparing doubles which is bad.
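
The concern about doubles as keys can be seen in a small sketch, assuming keys are grouped by exact value (the class and method names are illustrative only): two values that are mathematically equal can land in different groups because of rounding.

```java
import java.util.HashMap;
import java.util.Map;

class DoubleKeySketch {
    // Groups entries by an exact double key and returns how many distinct groups result.
    static int distinctGroups(double[] keys) {
        Map<Double, Integer> counts = new HashMap<>();
        for (double k : keys) {
            counts.merge(k, 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        // 0.1 + 0.2 is not exactly 0.3 in binary floating point, so these two keys do not collide.
        System.out.println(distinctGroups(new double[] {0.3, 0.1 + 0.2}));
    }
}
```

Using a constant key and carrying the weight in the value would avoid relying on exact double equality; that is a suggestion, not necessarily what the patch does.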



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8400>

    Where does the InterruptedException come from?



trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
<https://reviews.apache.org/r/3072/#comment8399>

    Use brackets



trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java
<https://reviews.apache.org/r/3072/#comment8401>

    Should not throw Exception


- Ted




                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf


        

[jira] [Commented] (MAHOUT-918) Implement SGD based classifiers using MapReduce

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167462#comment-13167462 ] 

jiraposter@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, line 36
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line36>
bq.  >
bq.  >     Needs a comment about how this works.

Added comments.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 67-75
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line67>
bq.  >
bq.  >     This really needs a comment.  What is the purpose here?

Added comments.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 98-111
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line98>
bq.  >
bq.  >     What is this intended to do?  Why?

Added comments.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 30
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line30>
bq.  >
bq.  >     Typo.
bq.  >     
bq.  >     Also, this doesn't say how this works or why it is the way it is.

Fixed the typo and added comments.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 32
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line32>
bq.  >
bq.  >     Shouldn't there be a combiner as well?

A combiner isn't needed because each map task submits one value overall.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 53
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line53>
bq.  >
bq.  >     A comment here about what this weight is would be nice.  Also, how can a double be a key?  That is tantamount to comparing doubles which is bad.

Added comments. It is not a weight of the classifier but the weight used in the weighted average.
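
One way to read "the weight of the weighted average" is sketched here, assuming each map task reports a model vector together with a non-negative weight (for instance the number of examples it trained on; that choice is an assumption, not something stated in the patch). The names are hypothetical.

```java
import java.util.Arrays;

class WeightedAverageSketch {
    // Combines per-task model vectors into one vector: sum(weight_i * model_i) / sum(weight_i).
    static double[] weightedAverage(double[][] models, double[] weights) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        double[] avg = new double[models[0].length];
        for (int i = 0; i < models.length; i++) {
            for (int j = 0; j < avg.length; j++) {
                avg[j] += weights[i] * models[i][j] / total;
            }
        }
        return avg;
    }

    public static void main(String[] args) {
        double[][] models = {{1.0, 2.0}, {3.0, 6.0}};
        double[] weights = {1.0, 3.0};
        System.out.println(Arrays.toString(weightedAverage(models, weights)));
    }
}
```

With equal weights this reduces to the plain average of the models.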


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, line 99
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line99>
bq.  >
bq.  >     Where does the InterruptedException come from?

It comes from the runIteration function.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, lines 110-111
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line110>
bq.  >
bq.  >     Use brackets

Added brackets.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java, line 35
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63198#file63198line35>
bq.  >
bq.  >     Should not throw Exception

Added IOException and InterruptedException.


bq.  On 2011-12-08 07:04:49, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 53-56
bq.  > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line53>
bq.  >
bq.  >     This is nearly duplicated code.  The mapper and reducer should share some code to avoid inconsistent defaults.

Created a base class which shares the same initialization code.
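
A shared base class of this kind might look like the following sketch. It is not the patch's SGDMapper or SGDReducer: the class names, the configuration key, and the default value are hypothetical, and a plain Map stands in for Hadoop's job configuration to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class SharedInitSketch {
    // Hypothetical base class: the only place where the key and its default are defined.
    abstract static class TaskBase {
        static final String LEARNING_RATE_KEY = "sgd.learningRate"; // made-up key
        static final double DEFAULT_LEARNING_RATE = 0.1;            // made-up default
        private double learningRate;

        // Reads the settings once; mapper and reducer both call this from their setup step.
        void configure(Map<String, String> conf) {
            String value = conf.get(LEARNING_RATE_KEY);
            learningRate = (value == null) ? DEFAULT_LEARNING_RATE : Double.parseDouble(value);
        }

        double learningRate() {
            return learningRate;
        }
    }

    // Both tasks inherit the same initialization, so their defaults cannot drift apart.
    static class SketchMapper extends TaskBase { }

    static class SketchReducer extends TaskBase { }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        SketchMapper mapper = new SketchMapper();
        SketchReducer reducer = new SketchReducer();
        mapper.configure(conf);
        reducer.configure(conf);
        System.out.println(mapper.learningRate() == reducer.learningRate());
    }
}
```

Because the default lives in one place, changing it affects the mapper and the reducer together.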


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3734
-----------------------------------------------------------




                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD based classifiers (Logistic Regression, Adaptive Logistic regression and Passive-Aggressive) using MapReduce.
> They are implemented using Iterative Parameter Mixtures algorithm which is referred to in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf
