Posted to issues@spark.apache.org by "Michel Lemay (JIRA)" <ji...@apache.org> on 2018/01/25 13:48:00 UTC

[jira] [Created] (SPARK-23216) Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS

Michel Lemay created SPARK-23216:
------------------------------------

             Summary: Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS
                 Key: SPARK-23216
                 URL: https://issues.apache.org/jira/browse/SPARK-23216
             Project: Spark
          Issue Type: Improvement
          Components: ML, MLlib
    Affects Versions: 2.2.1
            Reporter: Michel Lemay


When training a classifier with a large number of classes, performance sinks. This is expected when using regular (log)SoftMax methods to compute the loss, since the current class score must be normalized by the sum of the scores of all classes.
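
To make the cost concrete, here is a minimal sketch in plain Python (not Spark code; the function name `log_softmax_loss` is made up for illustration). It shows that the log-normalizer touches every class score, so the work per training example grows linearly with the number of classes.

```python
import math

def log_softmax_loss(scores, target):
    """Negative log-likelihood of `target` under a softmax over `scores`."""
    # The normalizer sums over every class score: O(K) work per example.
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target]

# With K equal scores, the loss is log(K).
scores = [0.0] * 4
loss = log_softmax_loss(scores, 2)
```

The gradient has the same property: it needs the softmax probability of every class, so both the forward and backward passes scale with the number of classes.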

I think it would be helpful to have approximate methods like Hierarchical SoftMax, NCE, NEG, or IS to speed up training.
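
As an illustration of the kind of saving these methods aim for, here is a rough sketch of a negative-sampling (NEG) style loss in plain Python. It is not a proposed Spark API: `neg_sampling_loss`, `score_fn`, and the uniform noise distribution are assumptions made for this example, and real implementations often sample negatives from a frequency-based distribution instead.

```python
import math
import random

def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def neg_sampling_loss(score_fn, target, num_classes, num_negatives, rng):
    """NEG-style loss: only the target and a few sampled classes are scored.

    Assumes num_classes >= 2 and draws negatives uniformly (an assumption).
    """
    loss = -log_sigmoid(score_fn(target))
    for _ in range(num_negatives):
        k = rng.randrange(num_classes)
        while k == target:  # resample if the target class was drawn
            k = rng.randrange(num_classes)
        loss -= log_sigmoid(-score_fn(k))
    return loss

# Count how many class scores are actually evaluated.
calls = []
def score_fn(k):
    calls.append(k)
    return 0.0  # hypothetical constant score, for illustration only

rng = random.Random(0)
loss = neg_sampling_loss(score_fn, target=3, num_classes=100000,
                         num_negatives=5, rng=rng)
```

Here only 6 of the 100000 class scores are computed per example. The trade-off is that the resulting scores are not normalized probabilities, so prediction may still require a full pass over the classes.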

A paper comparing different methods for approximate normalization over all classes:
[http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/AISTATS2017.pdf]



