Posted to issues@spark.apache.org by "christian sommeregger (JIRA)" <ji...@apache.org> on 2015/10/09 13:01:26 UTC
[jira] [Commented] (SPARK-2309) Generalize the binary logistic
regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950208#comment-14950208 ]
christian sommeregger commented on SPARK-2309:
----------------------------------------------
Hey everybody!
After inspecting the code on GitHub, I believe that we have not really
implemented the standard multinomial problem from http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25,
but rather a model that covers a set of binary choices with item-specific weights, which is a slightly different thing.
For a true multinomial setup, each row in the training data needs to contain all items (K = number of choices) that were available in a specific choice situation.
The current labeled point object, however, has just a choice flag plus the features of one item in each row:
e.g.: Labeled point (K=1)
0 | (,,,,)
1 | (,,,,)
3 | (,,,,)
3 | (,,,,)
0 | (,,,,)
....
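To make the distinction concrete, here is a minimal sketch of the kind of model this row layout supports: one feature vector per row and a separate weight vector per class. It is plain Python, not Spark or MLlib API; the function name, the weights, and the example row are all made up for illustration.

```python
import math

def class_probabilities(features, class_weights):
    """Softmax over classes: one feature vector, one weight vector per class."""
    scores = [sum(w * x for w, x in zip(weights, features))
              for weights in class_weights]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical row in the labeled point layout: (label, features of one item).
row = (1, [0.5, -1.0, 2.0])

# Hypothetical weights: one vector per class (3 classes, 0-based labels).
class_weights = [
    [0.0, 0.0, 0.0],
    [0.2, -0.1, 0.4],
    [-0.3, 0.5, 0.1],
]

label, features = row
probs = class_probabilities(features, class_weights)
log_likelihood = math.log(probs[label])  # contribution of this row
```

Note that the weights vary by class while the features do not, so the row never needs to know what the other alternatives looked like.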
For the model on http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25, we would instead need the following structure (an asterisk marks the chosen item).
e.g.: Always three items in the choice set (K=3)
Choice Indicator | Item1Features | Item2Features | Item3Features
1 | (,,,,)* | (,,,,) | (,,,,)
3 | (,,,,) | (,,,,) | (,,,,)*
3 | (,,,,) | (,,,,) | (,,,,)*
....
e.g.: Flexible number of items in the choice set (K varies)
8 | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,) | (,,,,)* | (,,,,)
2 | (,,,,) | (,,,,)* | (,,,,)
3 | (,,,,) | (,,,,) | (,,,,)* | (,,,,)
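
As a rough illustration of what I mean by the choice-set structure, the sketch below computes the log-likelihood for rows that carry the features of every available item, with a single weight vector shared across items. Again this is plain Python and not Spark or MLlib API; the function names, the weights, and the data are hypothetical, and the choice indicator is 1-based as in the examples above.

```python
import math

def choice_probabilities(item_features, weights):
    """Softmax over the items of one choice set, using shared weights."""
    scores = [sum(w * x for w, x in zip(weights, features))
              for features in item_features]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(rows, weights):
    """Sum of log-probabilities of the chosen item in each choice situation."""
    total = 0.0
    for chosen, item_features in rows:
        probs = choice_probabilities(item_features, weights)
        total += math.log(probs[chosen - 1])  # choice indicator is 1-based
    return total

# Hypothetical rows: (choice indicator, features of every available item).
# The number of items K may differ from row to row.
rows = [
    (2, [[1.0, 0.0], [0.5, 1.5], [0.0, 0.2]]),              # K = 3
    (1, [[0.3, 0.3], [0.1, 0.9]]),                          # K = 2
    (3, [[0.2, 0.1], [0.9, 0.4], [0.6, 0.6], [0.0, 1.0]]),  # K = 4
]

weights = [0.4, -0.2]  # hypothetical shared weights, one per feature
total_log_likelihood = log_likelihood(rows, weights)
```

Here the features vary by item while the weights do not, which is why each row has to contain the whole choice set.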
> Generalize the binary logistic regression into multinomial logistic regression
> ------------------------------------------------------------------------------
>
> Key: SPARK-2309
> URL: https://issues.apache.org/jira/browse/SPARK-2309
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: DB Tsai
> Assignee: DB Tsai
> Priority: Critical
> Fix For: 1.3.0
>
>
> Currently, there is no multi-class classifier in MLlib. Logistic regression can be extended to a multinomial one straightforwardly.
> The following formula will be implemented.
> http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25