Posted to issues@spark.apache.org by "christian sommeregger (JIRA)" <ji...@apache.org> on 2015/10/09 13:01:26 UTC

[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

    [ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950208#comment-14950208 ] 

christian sommeregger commented on SPARK-2309:
----------------------------------------------

Hey everybody! 
After inspecting the code on GitHub, I believe that we have not actually
implemented the standard multinomial problem from http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25
but rather a model that covers a set of binary choices with item-specific weights, which is a slightly different thing.

For a true multinomial setup, each row in the training data needs to contain all items (K = number of choices) that were available in a specific choice situation.
The current labelled point object, however, has just a choice flag plus the features of one item in each row:

e.g.: Labelled point (K=1)
0 | (,,,,)  
1 | (,,,,)  
3 | (,,,,) 
3 | (,,,,) 
0 | (,,,,)  
....

For the model on http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 we would rather need the following structure. 

e.g.: Always three items in the choice set (K=3); * marks the chosen item
Choice Indicator | Item1Features | Item2Features | Item3Features
1 | (,,,,)* | (,,,,) | (,,,,)
3 | (,,,,) | (,,,,) | (,,,,)*
3 | (,,,,) | (,,,,) | (,,,,)*
....

e.g.: Flexible number of items in the choice set (K varies)
8 | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,)* | (,,,,)
2 | (,,,,) | (,,,,)* | (,,,,)
3 | (,,,,) | (,,,,) | (,,,,)* | (,,,,)
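
To make the choice-set layout concrete, here is a minimal Python sketch. It assumes the model I have in mind is one where a single weight vector is shared across all items in a choice set (often called a conditional logit); that reading is my assumption, not something taken from the Spark code. `ChoiceSituation`, `choice_probabilities`, and `negative_log_likelihood` are hypothetical names for illustration, and the feature values are made up.

```python
import math
from typing import List, NamedTuple


class ChoiceSituation(NamedTuple):
    """One training row: every item that was available, plus which one was chosen."""
    chosen: int                # 1-based position of the chosen item in `items`
    items: List[List[float]]   # one feature vector per available item (K = len(items))


def choice_probabilities(weights: List[float], items: List[List[float]]) -> List[float]:
    """Softmax over item scores; the same weights are applied to every item."""
    scores = [sum(w * x for w, x in zip(weights, item)) for item in items]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(weights: List[float], data: List[ChoiceSituation]) -> float:
    """Sum of -log P(chosen item) over all choice situations."""
    nll = 0.0
    for row in data:
        probs = choice_probabilities(weights, row.items)
        nll -= math.log(probs[row.chosen - 1])
    return nll


# Made-up data: the number of items K differs from row to row.
data = [
    ChoiceSituation(chosen=2, items=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]),
    ChoiceSituation(chosen=1, items=[[0.2, 0.3], [0.1, 0.1]]),
]
weights = [0.0, 0.0]  # with all-zero weights every item in a set is equally likely

probs = choice_probabilities(weights, data[0].items)
```

The point of the sketch is that the probability of the chosen item is normalised over the items that were actually available in that row, which is only possible if the row carries all of them.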


> Generalize the binary logistic regression into multinomial logistic regression
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-2309
>                 URL: https://issues.apache.org/jira/browse/SPARK-2309
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: DB Tsai
>            Assignee: DB Tsai
>            Priority: Critical
>             Fix For: 1.3.0
>
>
> Currently, there is no multi-class classifier in MLlib. Logistic regression can be extended to a multinomial one straightforwardly.
> The following formula will be implemented. 
> http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25


