Posted to issues@spark.apache.org by "zhengruifeng (JIRA)" <ji...@apache.org> on 2015/04/21 10:31:58 UTC
[jira] [Commented] (SPARK-7008) An implementation of Factorization Machine (LibFM)
[ https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504596#comment-14504596 ]
zhengruifeng commented on SPARK-7008:
-------------------------------------
I had not considered the size of the model, because the problems I usually encounter have fewer than 10 million dimensions. For higher dimensionality, I think feature hashing may help limit the number of features (not sure).
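To illustrate the feature-hashing idea, here is a minimal sketch in plain Python (used only for brevity; it is not part of the proposed Scala code). The function name `hash_features`, the string feature names, and the use of CRC32 as the hash are all assumptions made for this example.

```python
import zlib


def hash_features(features, num_buckets):
    """Map {feature_name: value} into a sparse vector with at most num_buckets slots.

    Hypothetical helper: features that collide in the same bucket are summed,
    which is the usual price paid for bounding the model size.
    """
    hashed = {}
    for name, value in features.items():
        # CRC32 is deterministic across runs, unlike Python's built-in hash().
        index = zlib.crc32(name.encode("utf-8")) % num_buckets
        hashed[index] = hashed.get(index, 0.0) + value
    return hashed


example = hash_features({"user=42": 1.0, "item=7": 1.0, "age": 0.3}, 2 ** 20)
```

With this scheme the model size depends on `num_buckets` rather than on the number of distinct raw features, at the cost of occasional collisions.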
LibFM implements four training algorithms: SGD, adaptive SGD, ALS and MCMC. So far I have only implemented SGD for regression, and I am going to implement SGD for binary classification next.
In my opinion, SGD is sensitive to the learning rate: large values cause divergence, while small values make training slow.
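A toy illustration of this sensitivity, assuming plain (non-stochastic) gradient descent on f(w) = w^2 rather than an actual Factorization Machine; the function name and the step sizes are chosen only for this example:

```python
def gradient_descent(step_size, iterations, w0=1.0):
    """Minimise f(w) = w^2 (gradient 2w) with a constant step size."""
    w = w0
    for _ in range(iterations):
        w -= step_size * 2.0 * w
    return w


# Each step multiplies w by (1 - 2 * step_size).
too_big = gradient_descent(step_size=1.1, iterations=50)      # factor -1.2: |w| grows
too_small = gradient_descent(step_size=0.001, iterations=50)  # factor 0.998: barely moves
reasonable = gradient_descent(step_size=0.3, iterations=50)   # factor 0.4: near 0
```

The same trade-off shows up, less cleanly, when the gradient is noisy as in SGD.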
When coding, I followed LibFM strictly. There are only two differences: LibFM uses strict (per-example) SGD, while I use the mini-batch SGD provided by MLlib; and LibFM keeps the learning rate constant, while I decay it by the square root of the iteration counter. So I think its convergence should be similar to that of LibFM's SGD.
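The two differences can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the Scala code discussed in this issue: it assumes the standard second-order Factorization Machine model, a squared loss of 0.5 * (prediction - y)^2, no regularization, and a step size of `base_step / sqrt(iteration)`. All function and parameter names are hypothetical.

```python
import math
import random


def fm_predict(x, w0, w, v):
    """FM prediction for a sparse example x = {feature_index: value}."""
    linear = sum(w[i] * xi for i, xi in x.items())
    pairwise = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * xi for i, xi in x.items())
        s_sq = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)  # sum over i<j of v_if * v_jf * x_i * x_j
    return w0 + linear + pairwise


def minibatch_step(batch, w0, w, v, base_step, iteration):
    """One mini-batch step; updates w and v in place and returns the new w0."""
    step = base_step / math.sqrt(iteration)  # decaying step size (assumed form)
    k = len(v[0])
    g0, gw, gv = 0.0, {}, {}
    for x, y in batch:
        err = fm_predict(x, w0, w, v) - y
        sums = [sum(v[i][f] * xi for i, xi in x.items()) for f in range(k)]
        g0 += err
        for i, xi in x.items():
            gw[i] = gw.get(i, 0.0) + err * xi
            row = gv.setdefault(i, [0.0] * k)
            for f in range(k):
                row[f] += err * (xi * sums[f] - v[i][f] * xi * xi)
    n = len(batch)
    for i, g in gw.items():  # average the gradient over the mini-batch
        w[i] -= step * g / n
    for i, row in gv.items():
        for f in range(k):
            v[i][f] -= step * row[f] / n
    return w0 - step * g0 / n


def train(data, num_features, k=2, base_step=0.1, iterations=200,
          batch_size=4, seed=0):
    """Hypothetical training loop: sample a mini-batch, take one decayed step."""
    rng = random.Random(seed)
    w0 = 0.0
    w = [0.0] * num_features
    v = [[rng.gauss(0.0, 0.01) for _ in range(k)] for _ in range(num_features)]
    for t in range(1, iterations + 1):
        batch = rng.sample(data, min(batch_size, len(data)))
        w0 = minibatch_step(batch, w0, w, v, base_step, t)
    return w0, w, v
```

Setting `batch_size=1` and replacing `step` with a constant would give per-example SGD with a fixed learning rate, which is closer to the LibFM behaviour described above.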
I am testing the library now and will post the results in a few days.
Thanks.
> An implementation of Factorization Machine (LibFM)
> --------------------------------------------------
>
> Key: SPARK-7008
> URL: https://issues.apache.org/jira/browse/SPARK-7008
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.3.0, 1.3.1, 1.3.2
> Reporter: zhengruifeng
> Labels: features, patch
> Attachments: FM_convergence_rate.xlsx, QQ20150421-1.png, QQ20150421-2.png
>
>
> An implementation of Factorization Machines based on Scala and Spark MLlib.
> A Factorization Machine is a machine learning model for multi-linear regression and is widely used for recommendation.
> Factorization Machines have performed well in recommendation competitions in recent years.
> Ref:
> http://libfm.org/
> http://doi.acm.org/10.1145/2168752.2168771
> http://www.inf.uni-konstanz.de/~rendle/pdf/Rendle2010FM.pdf