Posted to issues@spark.apache.org by "Michel Lemay (JIRA)" <ji...@apache.org> on 2018/01/26 13:03:02 UTC

[jira] [Commented] (SPARK-23216) Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS

    [ https://issues.apache.org/jira/browse/SPARK-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341019#comment-16341019 ] 

Michel Lemay commented on SPARK-23216:
--------------------------------------

Another interesting paper on this subject: "Efficient softmax approximation for GPUs"

"Our approach, called adaptive softmax, circumvents the linear dependency on the vocabulary size by exploiting the unbalanced word distribution to form clusters that explicitly minimize the expectation of computation time."
[https://arxiv.org/pdf/1609.04309.pdf]
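
A minimal sketch of the clustering idea in Python, under simplifying assumptions: only two clusters (a "head" of frequent classes and one "tail" of rare classes), with scores passed in directly. The function names and the two-cluster layout are illustrative and are not taken from the paper or from Spark. The paper additionally chooses cluster boundaries to minimize expected computation time and uses smaller projections for the tail, which this sketch does not attempt.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def two_level_log_prob(head_scores, tail_scores_fn, target, head_size):
    """Log-probability of `target` under a two-cluster softmax.

    head_scores    : head_size scores for the frequent classes, followed by
                     one extra score for the "tail cluster" token.
    tail_scores_fn : hypothetical callable returning the scores of the rare
                     classes; it is only invoked when the target is rare.
    """
    head_lp = log_softmax(head_scores)
    if target < head_size:
        # Frequent class: only head_size + 1 scores were needed.
        return head_lp[target]
    # Rare class: p(tail cluster) * p(class | tail cluster).
    tail_lp = log_softmax(tail_scores_fn())
    return head_lp[head_size] + tail_lp[target - head_size]


if __name__ == "__main__":
    head = [0.5, -1.0, 2.0, 0.1]        # 3 frequent classes + tail token
    tail = [1.0, 0.0, -2.0, 0.3]        # 4 rare classes
    for t in range(7):
        print(t, two_level_log_prob(head, lambda: tail, t, head_size=3))
```

If most training examples belong to head classes, most loss evaluations skip the tail scores entirely, which is where the saving comes from.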

 

> Multiclass LogisticRegression could have methods like NCE, NEG, Hierarchical SoftMax, Blackout or IS
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23216
>                 URL: https://issues.apache.org/jira/browse/SPARK-23216
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 2.2.1
>            Reporter: Michel Lemay
>            Priority: Minor
>
> When training a classifier with a large number of classes, performance sinks. This is expected when using regular (log)SoftMax methods to compute the loss, since the current class score must be normalized by the sum of the scores of all classes.
> I think it would be helpful to have approximate methods like Hierarchical SoftMax, NCE, NEG, or IS to speed up training.
> A paper comparing different methods for approximate normalization over all classes:
> [http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/AISTATS2017.pdf]
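
To illustrate the cost described above, here is a minimal Python sketch contrasting the exact softmax loss, which evaluates all K class scores, with an importance-sampling (IS) estimate that evaluates only the target and a few sampled classes. The function names, the `score_fn` callable, and the uniform proposal distribution are assumptions made for illustration; this is not Spark's LogisticRegression code, and practical IS variants usually sample from a frequency-based proposal instead.

```python
import math
import random


def full_softmax_loss(score_fn, num_classes, target):
    """Exact negative log-likelihood; evaluates all num_classes scores."""
    scores = [score_fn(c) for c in range(num_classes)]
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target]


def sampled_softmax_loss(score_fn, num_classes, target, num_samples, rng):
    """Approximate loss; evaluates only num_samples + 1 scores.

    Negatives are drawn uniformly without replacement from the non-target
    classes (requires num_samples <= num_classes - 1), and their contribution
    to the normalizer is rescaled by (K - 1) / num_samples.
    """
    draw = rng.sample(range(num_classes), num_samples + 1)
    negatives = [c for c in draw if c != target][:num_samples]
    s_t = score_fn(target)
    neg_scores = [score_fn(c) for c in negatives]
    m = max([s_t] + neg_scores)
    scale = (num_classes - 1) / num_samples
    z_hat = math.exp(s_t - m) + scale * sum(math.exp(s - m) for s in neg_scores)
    return m + math.log(z_hat) - s_t


if __name__ == "__main__":
    rng = random.Random(0)
    k = 10000
    scores = [rng.gauss(0.0, 1.0) for _ in range(k)]
    print(full_softmax_loss(lambda c: scores[c], k, 42))
    print(sampled_softmax_loss(lambda c: scores[c], k, 42, 100, rng))
```

The estimate of the normalizer is unbiased, but its logarithm is not, so the sampled loss is only an approximation; it becomes exact when num_samples equals K - 1.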


